Abstract



Novel metalloproteases having thrombospondin domain(s) (MPTS proteins) 
and polypeptides related thereto, as well as nucleic acid compositions encoding the 
same, are provided. The subject polypeptide and nucleic acid compositions find use 
in a variety of applications, including diagnostic applications, therapeutic agent 
screening applications, as well as therapeutic applications for a variety of different 
conditions. Also provided are methods of treating disease conditions associated with 
aggrecanase activity, e.g. conditions characterized by the presence of aggrecan
cleavage products, such as rheumatoid- and osteo-arthritis.
The field of the invention is proteases, particularly metalloproteases with
thrombospondin domains. 

By dry weight of the tissue, cartilage matrix is made up of 70% collagen
and 20-30% proteoglycans. The proteoglycan component confers mechanical flexibility
to load bearing tissues and imparts viscoelastic properties to cartilage. Its loss leads to
rapid structural damage, as is seen most frequently in arthritic joint diseases and joint
injury.

Aggrecan is a major cartilage proteoglycan. Aggrecan is a large protein of 210
kDa and has three globular domains: G1, G2, and G3. The G1 and G2 domains of the
protein are closer to the amino terminus of the protein and their intervening
interglobular domain has sites that are proteolytically sensitive. The region between G2
and G3 is heavily glycosylated and connected to oligosaccharides and
glycosaminoglycans (GAGs) to form the mature proteoglycan. In arthritic cartilage,
core protein fragments of 55 kDa are observed and believed to be the result of cleavage
of the core protein in the G1 and G2 interglobular domain between asparagine 341 and
phenylalanine 342. This cleavage can be made by many matrix metalloproteinases, e.g.
MMP-1, -2, -3, -7, -8, -9, and -13. In addition, 60 kDa aggrecan fragments with a
COOH terminus of glutamic acid are also identified and are indicative of a cleavage site
between glutamic acid 373 and alanine 374. Matrix metalloproteinases are unable to
cleave at this site. The unique endopeptidase activity responsible for this cleavage has
been termed "aggrecanase."

The G1 domain of the core protein forms a stable ternary complex by binding to
hyaluronic acid and link proteins in the matrix. Any enzymatic cleavage in this region
destabilizes the cartilage matrix structure, leads to the loss of the major proteoglycan
aggrecan and exposes type II collagen to collagenases, causing cartilage loss and the
consequent development of joint disease. Since a variety of anti-arthritic drugs do not
As such, aggrecanase is considered to be an important drug target for arthritis.
Aggrecan fragments released into the synovial fluid are the primary detectable events in
the development of rheumatoid- and osteo-arthritis. The search for this protease has been
intense. Despite these discovery efforts, identification of human aggrecanase has
remained elusive.

As such, there is much interest in the identification of human aggrecanase, as 
well as the gene encoding this activity. 

The following references are directed to this field: U.S. Patents 5,872,209 and
5,427,954; and WO 99/09000; WO 98/55643; WO 98/51665; and WO 97/18207.

Other references include: Abbaszade et al., "Cloning and characterization of
ADAMTS11, an aggrecanase from the ADAMTS family," J. Biol. Chem. (Aug. 1999)
274: 23443-50; Arner et al., "Generation and Characterization of Aggrecanase. A
soluble, cartilage-derived aggrecan-degrading activity," J Biol Chem (1999 Mar 5)
274(10):6594-6601; Arner et al., "Cytokine-induced cartilage proteoglycan degradation
is mediated by aggrecanase," Osteoarthritis Cartilage (1998 May) 6(3):214-28;
Billington et al., "An aggrecan-degrading activity associated with chondrocyte
membranes," Biochem J (1998 Nov 15) 336 (Pt 1):207-12; Buttner et al., "Membrane
type 1 matrix metalloproteinase (MT1-MMP) cleaves the recombinant aggrecan
substrate rAgg1mut at the 'aggrecanase' and the MMP sites. Characterization of MT1-
MMP catabolic activities on the interglobular domain of aggrecan," Biochem J (1998 Jul
1) 333 (Pt 1):159-65; Flannery et al., "Expression of ADAMTS homologues in articular
cartilage," Biochem. Biophys. Res. Commun. (July 1999) 260:318-22; Hurskainen et al.,
"ADAM-TS5, ADAM-TS6, and ADAM-TS7, Novel members of a New Family of Zinc
Metalloproteases," J. Biol. Chem. (Sept. 1999) 274: 25555-25563; Hughes et al.,
"Differential expression of aggrecanase and matrix metalloproteinase activity in
chondrocytes isolated from bovine and porcine articular cartilage," J Biol Chem (1998
Nov 13) 273(46):30576-82; Ilic et al., "Characterization of aggrecan retained and lost
from the extracellular matrix of articular cartilage. Involvement of carboxyl-terminal
processing in the catabolism of aggrecan," J Biol Chem (1998 Jul 10) 273(28):17451-8;
Kuno et al., "ADAMTS-1 is an active metalloproteinase associated with the extracellular
matrix," J. Biol. Chem. (June 1999) 274: 18821-6; Kuno et al., "ADAMTS-1 protein
anchors at the extracellular matrix through the thrombospondin type I motifs and its
spacing region," J. Biol. Chem. (May 1998) 273:13912-7; Kuno et al., "The exon/intron
organization and chromosomal mapping of the mouse ADAMTS-1 gene encoding an
ADAM family protein with TSP motifs," Genomics (Dec. 1997) 46:466-71; Kuno et al.,
"Molecular cloning of a gene encoding a new type of metalloproteinase-disintegrin
family protein with thrombospondin motifs as an inflammation associated gene," J. Biol.
Chem. (Jan. 1997) 272: 556-62; Sandy et al., "Chondrocyte-mediated catabolism of
aggrecan: aggrecanase-dependent cleavage induced by interleukin-1 or retinoic acid can
be inhibited by glucosamine," Biochem J (1998 Oct 1) 335 (Pt 1):59-66; Tang & Hong,
"ADAMTS: a novel family of proteases with ADAM protease domain and
thrombospondin 1 repeats," FEBS Lett. (Feb. 1999) 445:223-5; Tortorella et al.,
"Purification and cloning of aggrecanase-1: a member of the ADAMTS family of
proteins," Science (June 1999) 284:1664-6; Vankemmelbeke et al., "Coincubation of
bovine synovial or capsular tissue with cartilage generates a soluble 'Aggrecanase'
activity," Biochem Biophys Res Commun (1999 Feb 24) 255(3):686-91; and Vasquez et
al., "METH-1, a human ortholog of ADAMTS-1, and METH-2 are members of a new
family of proteins with angio-inhibitory activity," J. Biol. Chem. (Aug. 1999) 274:23349-
57.

The present invention is directed to novel metalloproteases having
thrombospondin domain(s) (MPTS proteins) and polypeptides related thereto, as well
as nucleic acid compositions encoding the same. The subject polypeptide
and nucleic acid compositions find use in a variety of applications, including diagnostic
applications, therapeutic agent screening applications, as well as therapeutic
applications for a variety of different conditions. Also provided are methods of treating
disease conditions associated with aggrecanase activity, e.g. conditions characterized by
the presence of aggrecan cleavage products, such as rheumatoid- and osteo-arthritis.
of MPTS-15. Figure 1C provides an alignment of the amino acid sequence of the
subject MPTS-15 with the amino acid sequence of ADAMTS-6, a sequence disclosed
in Hurskainen et al., J. Biol. Chem. (Sept. 1999) 274: 25555-25563.

Figure 2A provides the sequence of a nucleic acid that encodes MPTS-10, an
MPTS protein of the subject invention. Figure 2B provides the amino acid sequence
of MPTS-10.

Figure 3A provides the sequence of a nucleic acid that encodes MPTS-19, an
MPTS protein of the subject invention. Figure 3B provides the amino acid sequence
of MPTS-19.

Figure 4A provides the sequence of a nucleic acid that encodes MPTS-20, an
MPTS protein of the subject invention. Figure 4B provides the amino acid sequence
of MPTS-20.

Novel MPTS proteins and polypeptides related thereto, as well as nucleic acid
compositions encoding the same, are provided. The subject polypeptide and/or nucleic
acid compositions find use in a variety of different applications, including research,
diagnostic, and therapeutic agent screening/discovery/preparation applications. Also
provided are methods of treating disease conditions associated with MPTS, including
aggrecanase, function, e.g. diseases characterized by the presence of aggrecan cleavage
products such as rheumatoid- and osteo-arthritis.

Novel metalloproteases having thrombospondin domain(s) (also known as
MPTS proteins, ADAMTS proteins or aggrecanase proteins), as well as polypeptide
compositions related thereto, are provided. The term polypeptide composition as used
herein refers to both the full length protein, as well as portions or fragments thereof.
Also included in this term are variations of the naturally occurring human protein,
where such variations are homologous or substantially similar to the naturally occurring
protein, as described in greater detail below. In the following description of the subject
invention, the term "MPTS" is used to refer not only to the specific human MPTS
proteins disclosed herein (i.e. MPTS-10; MPTS-15; MPTS-19 and MPTS-20), but also
to homologs thereof expressed in non-human species, e.g. murine, rat and other
mammalian species.
Specific human MPTS proteins of interest are MPTS-15, MPTS-10, MPTS-19
and MPTS-20. MPTS-15 has an amino acid sequence as shown in Fig. 1B and identified
as SEQ ID NO:01. MPTS-10 has an amino acid sequence as shown in Fig. 2B and
identified as SEQ ID NO:03. MPTS-19 has an amino acid sequence as shown in Fig. 3B
and identified as SEQ ID NO:05. MPTS-20 has an amino acid sequence as shown in Fig.
4B and identified as SEQ ID NO:07. The subject MPTS proteins have a molecular
weight based on their amino acid sequence of at least about 90 kDal, where the
molecular weight based on the amino acid sequence may be substantially higher in
certain embodiments. The true molecular weight of the subject MPTS proteins may vary
due to glycosylation and/or other post-translational modifications.
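
As an illustration of what "molecular weight based on the amino acid sequence" means, the following minimal sketch sums average residue masses and adds one water for the free termini. The function names and the 90 kDa default are illustrative assumptions, not part of the disclosure, and the calculation deliberately ignores glycosylation and other modifications.

```python
# Average residue masses in daltons for the 20 standard amino acids
# (mass of the free amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER_MASS = 18.015  # added once for the N-terminal H and C-terminal OH


def sequence_molecular_weight_kda(sequence: str) -> float:
    """Return the mass in kDa of an unmodified polypeptide chain."""
    residues = sequence.upper()
    unknown = set(residues) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"non-standard residue codes: {sorted(unknown)}")
    if not residues:
        return 0.0
    total_da = sum(AVERAGE_RESIDUE_MASS[r] for r in residues) + WATER_MASS
    return total_da / 1000.0


def meets_minimum_weight(sequence: str, minimum_kda: float = 90.0) -> bool:
    """Hypothetical helper: compare the sequence-based weight to a threshold."""
    return sequence_molecular_weight_kda(sequence) >= minimum_kda
```

The observed weight of an expressed protein on a gel can differ from this figure, which is the point the paragraph above makes about post-translational modification.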

Also provided by the subject invention are MPTS polypeptide compositions.
The term polypeptide composition as used herein refers to both the full length proteins
as well as portions or fragments thereof. Also included in this term are variations of the
naturally occurring proteins, where such variations are homologous or substantially
similar to the naturally occurring protein, be the naturally occurring protein the human
protein, mouse protein, or protein from some other species which naturally expresses
an MPTS protein, usually a mammalian species. A candidate homologous protein is
substantially similar to an MPTS protein of the subject invention, and therefore is an
MPTS protein of the subject invention, if the candidate protein has a sequence that has
at least about 35%, usually at least about 40% and more usually at least about 60%
sequence identity with an MPTS protein, as determined using MegAlign, DNAstar
(1998) clustal algorithm as described in D. G. Higgins and P.M. Sharp, "Fast and
Sensitive multiple Sequence Alignments on a Microcomputer," (1989) CABIOS, 5: 151-
153. (Parameters used are ktuple 1, gap penalty 3, window, 5 and diagonals saved 5). In
the following description of the subject invention, the term "MPTS-protein" is used to
refer not only to the human MPTS proteins, but also to homologs thereof expressed in
non-human species, e.g. murine, rat and other mammalian species.
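
To make the sequence identity test concrete, here is a minimal sketch that scores two sequences that have already been aligned (gaps written as "-") and reports the highest threshold met. It is not the MegAlign/clustal procedure named above; the way gaps are counted and the default thresholds of 35, 40 and 60 are assumptions chosen for illustration, and all function names are hypothetical.

```python
from typing import Optional, Sequence


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue.

    Assumption: a column with a gap in exactly one sequence counts as a
    mismatch; columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def highest_threshold_met(
    identity: float, thresholds: Sequence[float] = (35.0, 40.0, 60.0)
) -> Optional[float]:
    """Return the largest threshold not exceeding the identity, or None."""
    met = [t for t in thresholds if identity >= t]
    return max(met) if met else None
```

A real comparison would first produce the alignment with the stated algorithm and parameters; only the final counting step is sketched here.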



In many preferred
embodiments, the sequence identity is at least about 90%, usually at least about 95%
and more usually at least about 99% over the entire length of the protein.



In many embodiments, the proteins of the subject invention are enzymes,
particularly proteinases and more particularly metalloproteinases. The subject
proteins of this embodiment are characterized by having aggrecanase activity. As such,
the subject proteins are capable of cleaving aggrecan in an interglobular domain,
particularly between the G1 and G2 domains, and more particularly at the Glu373-Ala374
bond of human aggrecan, to produce a cleavage product having an N-terminal sequence
of ARGSVIL.
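
The cleavage described above can be pictured as splitting a peptide string between a glutamic acid and the alanine that begins ARGSVIL. The sketch below does only that string operation; the example peptide and the function name are hypothetical, and the residues placed around the cut are illustrative rather than taken from this description.

```python
from typing import Optional, Tuple


def cleave_between(
    sequence: str, n_side: str, c_side: str
) -> Optional[Tuple[str, str]]:
    """Split `sequence` at the first junction of `n_side` followed by `c_side`.

    Returns (N-terminal fragment, C-terminal fragment), or None when the
    junction is absent.
    """
    junction = sequence.find(n_side + c_side)
    if junction == -1:
        return None
    cut = junction + len(n_side)
    return sequence[:cut], sequence[cut:]


# Illustrative peptide only: a Glu-Ala junction followed by ARGSVIL.
example_peptide = "NITEGEARGSVILTVK"
fragments = cleave_between(example_peptide, "E", "ARGSVIL")
```

In this toy case the second fragment begins with ARGSVIL, which mirrors the new N-terminus that the text uses to recognize aggrecanase activity.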



In addition to the proteins described above, homologs or proteins (or fragments
thereof) from other species, i.e. other animal or plant species, are also provided, where
such homologs or proteins may be from a variety of different types of species, usually
mammals, e.g. rodents, such as mice, rats; domestic animals, e.g. horse, cow, dog, cat;
and humans. By homolog is meant a protein having at least about 35%, usually at least
about 40% and more usually at least about 60% amino acid sequence identity with one
of the specific human MPTS proteins as identified above (i.e. with a protein having the
amino acid sequence of SEQ ID NOS:01, 03, 05 or 07), where sequence identity is
determined as described supra.



The proteins of the subject invention are present in a non-naturally occurring
environment, e.g. they are separated from their naturally occurring environment. In
certain embodiments, the subject proteins are present in a composition that is enriched
for the subject protein as compared to its naturally occurring environment. For
example, purified protein is provided, where by purified is meant that the protein is
present in a composition that is substantially free of non-MPTS proteins, where by
substantially free is meant that less than 90%, usually less than 60% and more usually
less than 50% of the composition is made up of non-MPTS proteins. The proteins of
the subject invention may also be present as an isolate, by which is meant that the
protein is substantially free of other proteins and other naturally occurring biologic
molecules, such as oligosaccharides, polynucleotides and fragments thereof, and the
like, where the term "substantially free" in this instance means that less than 70%,
usually less than 60% and more usually less than 50% of the composition containing
the isolated protein is some other naturally occurring biological molecule. In certain
embodiments, the proteins are present in substantially pure form, where by
"substantially pure form" is meant at least 95%, usually at least 97% and more usually at
least 99% pure.

In addition to the naturally occurring proteins, polypeptides which vary from
the naturally occurring proteins are also provided, e.g. MPTS polypeptides. By MPTS
polypeptide is meant an amino acid sequence encoded by an open reading frame (ORF)
of the gene encoding the MPTS, described in greater detail below, including the full
length protein and fragments thereof, particularly biologically active fragments and/or
fragments corresponding to functional domains, e.g. protease domain,
thrombospondin domain, and the like; and including fusions of the subject
polypeptides to other proteins or parts thereof. Fragments of interest will typically be at
least about 10 aa in length, usually at least about 50 aa in length, and may be as long as
300 aa in length or longer, but will usually not exceed about 1000 aa in length, where the
fragment will have a stretch of amino acids that is identical to the subject protein of at
least about 10 aa, and usually at least about 15 aa, and in many embodiments at least
about 50 aa in length. Where the fragment is an MPTS-15 fragment, it preferably
includes at least a substantial portion of the protease domain of the wild type protein,
where by substantial portion is meant at least 50%, usually at least 60% and more usually at
least 70% of the sequence of this domain of the MPTS-15 protein. For example, the
MPTS-15 fragment generally includes a sequence which, upon alignment with the
sequence of residues from the protease domain of the wild type sequence, shows an
identity with the aligned region of the wild type sequence of this domain of at least
about 50%, usually at least about 60% and more usually at least about 70%, wherein in
many embodiments the percent identity may be much higher, e.g. 75, 80, 85, 90 or 95%
or higher, e.g. 99%.
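
The fragment criteria above combine a minimum length with a minimum run of residues identical to the parent protein. A minimal sketch of that check follows, using a standard longest-common-substring scan; the function names and default cutoffs of 10 aa are illustrative assumptions, and the sequences in any real use would come from the sequence listing rather than from this example.

```python
def longest_shared_stretch(fragment: str, reference: str) -> int:
    """Length of the longest contiguous run present in both sequences."""
    best = 0
    previous = [0] * (len(reference) + 1)
    for i in range(1, len(fragment) + 1):
        current = [0] * (len(reference) + 1)
        for j in range(1, len(reference) + 1):
            if fragment[i - 1] == reference[j - 1]:
                # Extend the run that ended at the previous pair of positions.
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


def is_fragment_of_interest(
    fragment: str,
    reference: str,
    min_length: int = 10,
    min_identical_stretch: int = 10,
) -> bool:
    """Hypothetical check: long enough, and shares a long identical stretch."""
    return (
        len(fragment) >= min_length
        and longest_shared_stretch(fragment, reference) >= min_identical_stretch
    )
```

Raising `min_identical_stretch` to 15 or 50 corresponds to the stricter embodiments mentioned in the paragraph.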



cartilage and the like. The subject proteins may also be derived from synthetic means,
e.g. by expressing a recombinant gene encoding the protein of interest in a suitable host, as
described in greater detail below. Any convenient protein purification procedures may
be employed, where suitable protein purification methodologies are described in Guide
to Protein Purification, (Deutscher ed.) (Academic Press, 1990). For example, a lysate
may be prepared from the original source, e.g. chondrocytes or the expression host, and
purified using HPLC, exclusion chromatography, gel electrophoresis, affinity
chromatography, and the like.

Also provided are nucleic acid compositions encoding MPTS proteins or
fragments thereof, as well as the MPTS homologues of the present invention. By nucleic
acid composition is meant a composition comprising a sequence of DNA having an
open reading frame that encodes an MPTS polypeptide of the subject invention, i.e. an
MPTS gene, and is capable, under appropriate conditions, of being expressed as MPTS.
Also encompassed in this term are nucleic acids that are homologous or substantially
similar or identical to the nucleic acids encoding MPTS proteins. Thus, the subject
invention provides genes encoding the human MPTS proteins of the subject invention
and homologs thereof. The human MPTS15 gene is shown in Fig. 1A, where the
sequence shown in Fig. 1A is identified as SEQ ID NO:02, infra. The human MPTS10
gene is shown in Fig. 2A, where the sequence shown in Fig. 2A is identified as SEQ ID
NO:04, infra. The human MPTS19 gene is shown in Fig. 3A, where the sequence shown
in Fig. 3A is identified as SEQ ID NO:06, infra. The human MPTS20 gene is shown in
Fig. 4A, where the sequence shown in Fig. 4A is identified as SEQ ID NO:08, infra.

The source of homologous genes may be any species, e.g., primate species,
particularly human, rodents, such as rats and mice, canines, felines, bovines, ovines,
equines, yeast, nematodes, etc. Between mammalian species, e.g., human and mouse,
homologs have substantial sequence similarity, e.g. at least 75% sequence identity,
usually at least 90%, more usually at least 95% between nucleotide sequences. Sequence
similarity is calculated based on a reference sequence, which may be a subset of a larger
sequence, such as a conserved motif, coding region, flanking region, etc. A reference
sequence will usually be at least about 18 nt long, more usually at least about 30 nt long,
and may extend to the complete sequence that is being compared. Algorithms for
sequence analysis are known in the art, such as BLAST, described in Altschul et al.
(1990), J. Mol. Biol. 215:403-10 (using default settings, i.e. parameters w=4 and T=17).
The sequences provided herein are essential for recognizing MPTS-, including
aggrecanase-, related and homologous proteins, and the nucleic acids encoding the
same, in database searches. Of particular interest in certain embodiments are nucleic
acids of substantially the same length as the nucleic acids identified as SEQ ID NO:02,
04, 06 and 08 that have sequence identity to one of these sequences of at least about
90%, usually at least about 95% and more usually at least about 99% over the entire
length of the nucleic acid.

Nucleic acids encoding the proteins and polypeptides of the subject invention
may be cDNA or genomic DNA or a fragment thereof. The term "MPTS gene" shall be
intended to mean the open reading frame encoding specific MPTS proteins and
polypeptides, and introns, as well as adjacent 5' and 3' non-coding nucleotide
sequences involved in the regulation of expression, up to about 20 kb beyond the coding
region, but possibly further in either direction. The gene may be introduced into an
appropriate vector for extrachromosomal maintenance or for integration into a host
genome.

The term "cDNA" as used herein is intended to include all nucleic acids that
share the arrangement of sequence elements found in native mature mRNA species,
where sequence elements are exons and 5' and 3' non-coding regions. Normally
mRNA species have contiguous exons, with the intervening introns, when present,
being removed by nuclear RNA splicing, to create a continuous open reading frame
encoding an MPTS protein.
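
The relationship between a genomic sequence and its cDNA can be sketched as joining exon intervals and checking that the result reads as one uninterrupted frame. The toy sequence, exon coordinates and function names below are hypothetical and chosen only to show the idea; real exon boundaries would come from the gene annotation.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def splice_exons(genomic: str, exons: List[Tuple[int, int]]) -> str:
    """Join exon intervals (0-based, end-exclusive, in transcript order)."""
    parts = []
    previous_end = 0
    for start, end in exons:
        if not (previous_end <= start < end <= len(genomic)):
            raise ValueError(f"invalid or out-of-order exon: {(start, end)}")
        parts.append(genomic[start:end])
        previous_end = end
    return "".join(parts)


def has_continuous_orf(coding: str) -> bool:
    """True if the sequence starts with ATG, ends with a stop codon,
    is a whole number of codons, and has no internal stop codon."""
    coding = coding.upper()
    if len(coding) < 6 or len(coding) % 3 != 0 or not coding.startswith("ATG"):
        return False
    codons = [coding[i:i + 3] for i in range(0, len(coding), 3)]
    return codons[-1] in STOP_CODONS and not any(
        codon in STOP_CODONS for codon in codons[1:-1]
    )


# Toy gene: exon 1, a 12 nt intron carrying an in-frame stop, exon 2.
toy_genomic = "ATGGCC" + "GTAAGTTGACAG" + "AAATGA"
toy_cdna = splice_exons(toy_genomic, [(0, 6), (18, 24)])
```

Here the unspliced toy sequence is interrupted by a stop codon inside the intron, while the spliced product reads through as a single frame.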

A genomic sequence of interest comprises the nucleic acid present between the
initiation codon and the stop codon, as defined in the listed sequences, including all of
the introns that are normally present in a native chromosome. It may further include 5'

fragment of 100 kbp or smaller; and substantially free of flanking chromosomal
sequence. The genomic DNA flanking the coding region, either 3' or 5', or internal
regulatory sequences as sometimes found in introns, contains sequences required for
proper tissue and stage specific expression.

The nucleic acid compositions of the subject invention may encode all or a part
of the subject MPTS protein. Double or single stranded fragments may be obtained
from the DNA sequence by chemically synthesizing oligonucleotides in accordance with
conventional methods, by restriction enzyme digestion, by PCR amplification, etc. For
the most part, DNA fragments will be of at least 15 nt, usually at least 18 nt or 25 nt, and
may be at least about 50 nt.

The subject genes are isolated and obtained in substantial purity, generally as
other than an intact chromosome. Usually, the DNA will be obtained substantially free
of other nucleic acid sequences that do not include an MPTS gene sequence or fragment
thereof, generally being at least about 50%, usually at least about 90% pure and are
typically "recombinant", i.e. flanked by one or more nucleotides with which it is not
normally associated on a naturally occurring chromosome.

In addition to the plurality of uses described in greater detail in following
sections, the subject nucleic acid compositions find use in the preparation of all or a
portion of the MPTS polypeptides, as described above. The provided polynucleotide
(e.g., a polynucleotide having a sequence of SEQ ID NO:02, 04, 06 or 08), the
corresponding cDNA, or the full-length gene is used to express a partial or complete
gene product. Constructs of polynucleotides having a sequence of SEQ ID NOs: 02, 04,
06 or 08 can be generated synthetically. Alternatively, single-step assembly of a gene
and entire plasmid from large numbers of oligodeoxyribonucleotides is described by,
e.g., Stemmer et al., Gene (Amsterdam) (1995) 164(1):49-53. In this method, assembly
PCR (the synthesis of long DNA sequences from large numbers of
oligodeoxyribonucleotides (oligos)) is described. The method is derived from DNA shuffling
(Stemmer, Nature (1994) 370:389-391), and does not rely on DNA ligase, but instead
relies on DNA polymerase to build increasingly longer DNA fragments during the
assembly process. Appropriate polynucleotide constructs are purified using standard
recombinant DNA techniques as described in, for example, Sambrook et al., Molecular
Cloning: A Laboratory Manual, 2nd Ed., (1989) Cold Spring Harbor Press, Cold Spring
Harbor, NY, and under current regulations described in United States Dept. of HHS,
National Institute of Health (NIH) Guidelines for Recombinant DNA Research.
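
The idea behind assembling a long sequence from overlapping oligos can be sketched in a few lines. This is a simplified string model, not the cited protocol: it assumes the oligos are supplied in order along the target, that every second oligo is written as the antisense strand, and that neighbouring oligos share an exact overlap. All names and the toy sequence are hypothetical.

```python
from typing import List

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return sequence.translate(_COMPLEMENT)[::-1]


def merge_by_overlap(left: str, right: str, min_overlap: int) -> str:
    """Join two sense-strand pieces on their longest suffix/prefix overlap."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError("no overlap of the required length")


def assemble(oligos: List[str], min_overlap: int = 6) -> str:
    """Extend the growing product one oligo at a time.

    Assumption: oligos at odd positions are antisense and are flipped to
    the sense strand before merging.
    """
    product = oligos[0]
    for index, oligo in enumerate(oligos[1:], start=1):
        sense = reverse_complement(oligo) if index % 2 else oligo
        product = merge_by_overlap(product, sense, min_overlap)
    return product


toy_target = "ATGGCCAAGCTTGGATCCGAATTCTGA"
toy_oligos = [
    toy_target[0:12],                       # sense
    reverse_complement(toy_target[6:20]),   # antisense, overlaps both neighbours
    toy_target[14:27],                      # sense
]
```

In the laboratory the extension step is carried out by the polymerase over repeated cycles; the loop above only mimics the outcome, namely that each overlap lets the product grow longer.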

Polynucleotide molecules comprising a polynucleotide sequence provided herein
are propagated by placing the molecule in a vector. Viral and non-viral vectors are
used, including plasmids. The choice of plasmid will depend on the type of cell in which
propagation is desired and the purpose of propagation. Certain vectors are useful for
amplifying and making large amounts of the desired DNA sequence. Other vectors are
suitable for expression in cells in culture. Still other vectors are suitable for transfer and
expression in cells in a whole animal or person. The choice of appropriate vector is well
within the skill of the art. Many such vectors are available commercially. The partial or
full-length polynucleotide is inserted into a vector typically by means of DNA ligase
attachment to a cleaved restriction enzyme site in the vector. Alternatively, the desired
nucleotide sequence can be inserted by homologous recombination in vivo. Typically
this is accomplished by attaching regions of homology to the vector on the flanks of the
desired nucleotide sequence. Regions of homology are added by ligation of
oligonucleotides, or by polymerase chain reaction using primers comprising both the
region of homology and a portion of the desired nucleotide sequence, for example.
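
The last option above, adding regions of homology with PCR primers, can be sketched as plain string construction. Each primer carries a homology region at its 5' end and a portion of the desired sequence at its 3' end. The arm sequences, the annealing length and the function names are hypothetical placeholders; practical primer design would also weigh melting temperature and secondary structure, which this sketch ignores.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return sequence.translate(_COMPLEMENT)[::-1]


def primers_with_homology(
    insert: str, upstream_arm: str, downstream_arm: str, anneal_length: int = 18
) -> Tuple[str, str]:
    """Return (forward, reverse) primers, both written 5' to 3'.

    forward = upstream homology arm + start of the insert
    reverse = reverse complement of (end of the insert + downstream arm)
    """
    if len(insert) < anneal_length:
        raise ValueError("insert is shorter than the annealing length")
    forward = upstream_arm + insert[:anneal_length]
    reverse = reverse_complement(insert[-anneal_length:] + downstream_arm)
    return forward, reverse


def expected_product(insert: str, upstream_arm: str, downstream_arm: str) -> str:
    """Sense strand of the amplified product, flanked by both arms."""
    return upstream_arm + insert + downstream_arm
```

The product then carries vector-matching sequence on both flanks, which is what allows recombination into the vector in vivo.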

For expression, an expression cassette or system may be employed. The gene
product encoded by a polynucleotide of the invention is expressed in any convenient
expression system, including, for example, bacterial, yeast, insect, amphibian and
mammalian systems. Suitable vectors and host cells are described in U.S. Patent No.
5,654,173. In the expression vector, an MPTS encoding polynucleotide, e.g. as set forth
in SEQ ID NO: 02, 04, 06 or 08, is linked to a regulatory sequence as appropriate to
obtain the desired expression properties. These can include promoters (attached either
at the 5' end of the sense strand or at the 3' end of the antisense strand), enhancers.
linkage to vectors. Any techniques known in the art can be used. In other words, the
expression vector will provide a transcriptional and translational initiation region,
which may be inducible or constitutive, where the coding region is operably linked
under the transcriptional control of the transcriptional initiation region, and a
transcriptional and translational termination region. These control regions may be
native to the subject MPTS gene, or may be derived from exogenous sources.

Expression vectors generally have convenient restriction sites located near the 
promoter sequence to provide for the insertion of nucleic acid sequences encoding 
heterologous proteins. A selectable marker operative in the expression host may be 
present. Expression vectors may be used for the production of fusion proteins, where 
the exogenous fusion peptide provides additional functionality, i.e. increased protein 
synthesis, stability, reactivity with defined antisera, an enzyme marker, e.g. 
β-galactosidase, etc.

Expression cassettes may be prepared comprising a transcription initiation
region, the gene or fragment thereof, and a transcriptional termination region. Of
particular interest is the use of sequences that allow for the expression of functional
epitopes or domains, usually at least about 8 amino acids in length, more usually at least
about 15 amino acids in length, to about 25 amino acids, and up to the complete open
reading frame of the gene. After introduction of the DNA, the cells containing the
construct may be selected by means of a selectable marker, the cells expanded and then
used for expression.

The MPTS proteins and polypeptides may be expressed in prokaryotes or
eukaryotes in accordance with conventional ways, depending upon the purpose for
expression. For large scale production of the protein, a unicellular organism, such as E.
coli, B. subtilis, S. cerevisiae, insect cells in combination with baculovirus vectors, or cells
of a higher organism such as vertebrates, particularly mammals, e.g. COS 7 cells, HEK
293, CHO, Xenopus oocytes, etc., may be used as the expression host cells. In some
situations, it is desirable to express the gene in eukaryotic cells, where the expressed
protein will benefit from native folding and post-translational modifications. Small
peptides can also be synthesized in the laboratory. Polypeptides that are subsets of the
complete protein sequence may be used to identify and investigate parts of the protein
important for function.

Specific expression systems of interest include bacterial, yeast, insect cell and
mammalian cell derived expression systems. Representative systems from each of these
categories are provided below.

Bacteria. Expression systems in bacteria include those described in Chang et al.,
Nature (1978) 275:615; Goeddel et al., Nature (1979) 281:544; Goeddel et al., Nucleic
Acids Res. (1980) 8:4057; EP 0 036,776; U.S. Patent No. 4,551,433; DeBoer et al., Proc.
Natl. Acad. Sci. (USA) (1983) 80:21-25; and Siebenlist et al., Cell (1980) 20:269.

Yeast. Expression systems in yeast include those described in Hinnen et al., Proc.
Natl. Acad. Sci. (USA) (1978) 75:1929; Ito et al., J. Bacteriol. (1983) 153:163; Kurtz et al.,
Mol. Cell. Biol. (1986) 6:142; Kunze et al., J. Basic Microbiol. (1985) 25:141; Gleeson et
al., J. Gen. Microbiol. (1986) 132:3459; Roggenkamp et al., Mol. Gen. Genet. (1986)
202:302; Das et al., J. Bacteriol. (1984) 158:1165; De Louvencourt et al., J. Bacteriol.
(1983) 154:737; Van den Berg et al., Bio/Technology (1990) 8:135; Kunze et al., J. Basic
Microbiol. (1985) 25:141; Cregg et al., Mol. Cell. Biol. (1985) 5:3376; U.S. Patent Nos.
4,837,148 and 4,929,555; Beach and Nurse, Nature (1981) 300:706; Davidow et al., Curr.
Genet. (1985) 10:380; Gaillardin et al., Curr. Genet. (1985) 10:49; Ballance et al.,
Biochem. Biophys. Res. Commun. (1983) 112:284-289; Tilburn et al., Gene (1983)
26:205-221; Yelton et al., Proc. Natl. Acad. Sci. (USA) (1984) 81:1470-1474; Kelly and
Hynes, EMBO J. (1985) 4:475479; EP 0 244,234; and WO 91/00357.

Insect Cells. Expression of heterologous genes in insects is accomplished as
described in U.S. Patent No. 4,745,051; Friesen et al., "The Regulation of Baculovirus
Gene Expression", in: The Molecular Biology Of Baculoviruses (1986) (W. Doerfler, ed.);
EP 0 127,839; EP 0 155,476; and Vlak et al., J. Gen. Virol. (1988) 69:765-776; Miller et al.,
Ann. Rev. Microbiol. (1988) 42:177; Carbonell et al., Gene (1988) 73:409; Maeda et al.,
Nature (1985) 315:592-594; Lebacq-Verheyden et al., Mol. Cell. Biol. (1988) 8:3129;
et al., Bio/Technology (1988) 6:47-55, Miller et al., Genetic Engineering (1986) 8:277-279,
and Maeda et al., Nature (1985) 315:592-594.

Mammalian Cells. Mammalian expression is accomplished as described in
Dijkema et al., EMBO J. (1985) 4:761, Gorman et al., Proc. Natl. Acad. Sci. (USA) (1982)
79:6777, Boshart et al., Cell (1985) 41:521 and U.S. Patent No. 4,399,216. Other features
of mammalian expression are facilitated as described in Ham and Wallace, Meth. Enz.
(1979) 58:44, Barnes and Sato, Anal. Biochem. (1980) 102:255, U.S. Patent Nos.
4,767,704, 4,657,866, 4,927,762, 4,560,655, WO 90/103430, WO 87/00195, and U.S. RE
30,985.

When any of the above host cells, or other appropriate host cells or organisms, 
are used to replicate and/or express the polynucleotides or nucleic acids of the 
invention, the resulting replicated nucleic acid, RNA, expressed protein or polypeptide, 
is within the scope of the invention as a product of the host cell or organism. The 
product is recovered by any appropriate means known in the art. 

Once the gene corresponding to a selected polynucleotide is identified, its 
expression can be regulated in the cell to which the gene is native. For example, an
endogenous gene of a cell can be regulated by an exogenous regulatory sequence as 
disclosed in U.S. Patent No. 5,641,670. 

The subject polypeptide and nucleic acid compositions find use in a variety of
different applications, including general applications, diagnostic applications, and
therapeutic agent screening/discovery/preparation applications, as well as in
therapeutic compositions and methods employing the same.

The subject nucleic acid compositions find use in a variety of general 
applications. General applications of interest include: the identification of MPTS 
homologs; as a source of novel promoter elements; the identification of MPTS 
expression regulatory factors; as probes and primers in hybridization applications, e.g. 
PCR; the identification of expression patterns in biological specimens; the preparation
of cell or animal models for MPTS function; the preparation of in vitro models for
MPTS function; etc.

Homologs of the subject genes are identified by any of a number of methods. A
fragment of the provided cDNA may be used as a hybridization probe against a cDNA
library from the target organism of interest, where low stringency conditions are used.
The probe may be a large fragment, or one or more short degenerate primers. Nucleic
acids having sequence similarity are detected by hybridization under low stringency
conditions, for example, at 50°C and 6xSSC (0.9 M sodium chloride/0.09 M sodium
citrate) and remain bound when subjected to washing at 55°C in 1xSSC (0.15 M sodium
chloride/0.015 M sodium citrate). Sequence identity may be determined by
hybridization under stringent conditions, for example, at 50°C or higher and 0.1xSSC
(15 mM sodium chloride/1.5 mM sodium citrate). Nucleic acids having a region of
substantial identity to the provided sequences, e.g. allelic variants, genetically altered
versions of the gene, etc., bind to the provided sequences under stringent hybridization
conditions. By using probes, particularly labeled probes of DNA sequences, one can
isolate homologous or related genes.
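
The salt concentrations quoted for the hybridization and wash buffers all scale linearly from 1xSSC. As an illustration only, the following minimal sketch reproduces that arithmetic; the function name `ssc_concentrations` and the rounding to six decimal places are assumptions made for this example and are not part of the disclosure.

```python
# Molar concentrations in 1xSSC, as stated in the text above.
NACL_1X_M = 0.15      # sodium chloride
CITRATE_1X_M = 0.015  # sodium citrate


def ssc_concentrations(strength):
    """Return (NaCl, sodium citrate) molarity for a given SSC strength.

    `strength` is the multiplier in front of "SSC", e.g. 6 for 6xSSC.
    Rounding only suppresses floating-point noise.
    """
    if strength <= 0:
        raise ValueError("SSC strength must be positive")
    return (round(NACL_1X_M * strength, 6), round(CITRATE_1X_M * strength, 6))


# The three buffers named in the text: low stringency hybridization,
# low stringency wash, and stringent conditions.
for label, strength in (("6xSSC", 6), ("1xSSC", 1), ("0.1xSSC", 0.1)):
    nacl, citrate = ssc_concentrations(strength)
    print(f"{label}: {nacl} M NaCl / {citrate} M sodium citrate")
```

The 0.1xSSC result, 0.015 M and 0.0015 M, corresponds to the 15 mM and 1.5 mM figures given for stringent conditions.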

The sequence of the 5' flanking region may be utilized for promoter elements,
including enhancer binding sites, that provide for developmental regulation in tissues
where the subject MPTS gene is expressed. The tissue specific expression is useful for
determining the pattern of expression, and for providing promoters that mimic the
native pattern of expression. Naturally occurring polymorphisms in the promoter
region are useful for determining natural variations in expression, particularly those
that may be associated with disease.

Alternatively, mutations may be introduced into the promoter region to
determine the effect of altering expression in experimentally defined systems. Methods
for the identification of specific DNA motifs involved in the binding of transcriptional
factors are known in the art, e.g. sequence similarity to known binding motifs, gel
retardation studies, etc. For examples, see Blackwell et al. (1995) Mol. Med. 1:194-205;

The regulatory sequences may be used to identify cis acting sequences required
for transcriptional or translational regulation of MPTS gene expression, especially in
different tissues or stages of development, and to identify cis acting sequences and
trans-acting factors that regulate or mediate MPTS gene expression. Such transcription or
translational control regions may be operably linked to an MPTS gene in order to
promote expression of wild type or altered MPTS or other proteins of interest in 
cultured cells, or in embryonic, fetal or adult tissues, and for gene therapy. 

Small DNA fragments are useful as primers for PCR, hybridization screening
probes, etc. Larger DNA fragments, i.e. greater than 100 nt, are useful for production of
the encoded polypeptide, as described in the previous section. For use in geometric
amplification reactions, such as geometric PCR, a pair of primers will be used. The
exact composition of the primer sequences is not critical to the invention, but for most
applications the primers will hybridize to the subject sequence under stringent
conditions, as known in the art. It is preferable to choose a pair of primers that will
generate an amplification product of at least about 50 nt, preferably at least about
100 nt. Algorithms for the selection of primer sequences are generally known, and are
available in commercial software packages. Amplification primers hybridize to
complementary strands of DNA, and will prime towards each other.
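
The geometry of a primer pair described above can be illustrated with a short sketch. It assumes exact-match primers and a single-stranded template written 5' to 3'; the sequences, the function names and the exact-match search are hypothetical simplifications for this example, not a primer design algorithm.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon_length(template, forward, reverse):
    """Length in nt of the product a primer pair would generate, or None.

    `template` is the top strand, 5'->3'. The forward primer matches the top
    strand directly. The reverse primer anneals to the top strand, so its
    reverse complement is what appears in the top-strand sequence. The two
    primers therefore prime towards each other.
    """
    start = template.find(forward)
    if start < 0:
        return None
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end < 0:
        return None
    return end + len(site) - start


def meets_minimum(length, minimum=50):
    """True if the product is at least `minimum` nt (50 nt per the text)."""
    return length is not None and length >= minimum


# Hypothetical template: 10 nt forward site, 40 nt spacer, 10 nt reverse site.
template = "AAAA" + "ACGTACGTAC" + "G" * 40 + "GATTACAGGC" + "CCCC"
forward = "ACGTACGTAC"
reverse = reverse_complement("GATTACAGGC")

length = amplicon_length(template, forward, reverse)
print(length, meets_minimum(length), meets_minimum(length, minimum=100))
```

Real primer selection also weighs melting temperature, self-complementarity and mismatches, which this sketch leaves out.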

The DNA may also be used to identify expression of the gene in a biological
specimen. The manner in which one probes cells for the presence of particular
nucleotide sequences, as genomic DNA or RNA, is well established in the literature.
Briefly, DNA or mRNA is isolated from a cell sample. The mRNA may be amplified by
RT-PCR, using reverse transcriptase to form a complementary DNA strand, followed by
polymerase chain reaction amplification using primers specific for the subject DNA
sequences. Alternatively, the mRNA sample is separated by gel electrophoresis,
transferred to a suitable support, e.g. nitrocellulose, nylon, etc., and then probed with a
fragment of the subject DNA as a probe. Other techniques, such as oligonucleotide
ligation assays, in situ hybridizations, and hybridization to DNA probes arrayed on a
solid chip may also find use. Detection of mRNA hybridizing to the subject sequence is
indicative of MPTS gene expression in the sample.

The sequence of an MPTS gene, including flanking promoter regions and coding
regions, may be mutated in various ways known in the art to generate targeted changes
in promoter strength, sequence of the encoded protein, etc. The DNA sequence or
protein product of such a mutation will usually be substantially similar to the sequences
provided herein, i.e. will differ by at least one nucleotide or amino acid, respectively,
and may differ by at least two but not more than about ten nucleotides or amino acids.
The sequence changes may be substitutions, insertions, deletions, or a combination
thereof. Deletions may further include larger changes, such as deletions of a domain or
exon. Other modifications of interest include epitope tagging, e.g. with the FLAG
system, HA, etc. For studies of subcellular localization, fusion proteins with green
fluorescent proteins (GFP) may be used.
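
The "at least one but not more than about ten" criterion can be checked mechanically for the simplest case. The sketch below is an illustration under a stated assumption: it counts substitutions only, between two sequences of equal length. Insertions and deletions would require an alignment step that is not shown, and the function names and example sequences are hypothetical.

```python
def count_substitutions(reference, variant):
    """Count positions at which two equal-length sequences differ."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length; align them first")
    return sum(1 for a, b in zip(reference, variant) if a != b)


def within_described_range(reference, variant, low=1, high=10):
    """True if the variant differs at `low` to `high` positions inclusive."""
    return low <= count_substitutions(reference, variant) <= high


# Hypothetical short sequences, for illustration only.
reference = "ATGGCCAAGT"
variant = "ATGACCAAGT"   # one substitution at position 4

print(count_substitutions(reference, variant))      # number of differences
print(within_described_range(reference, variant))   # differs, and by <= 10
print(within_described_range(reference, reference)) # identical: not a variant
```

The same functions work on amino acid sequences written in one-letter code.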

Techniques for in vitro mutagenesis of cloned genes are known. Examples of
protocols for site specific mutagenesis may be found in Gustin et al. (1993),
Biotechniques 14:22; Barany (1985), Gene 37:111-23; Colicelli et al. (1985), Mol. Gen.
Genet. 199:537-9; and Prentki et al. (1984), Gene 29:303-13. Methods for site specific
mutagenesis can be found in Sambrook et al., Molecular Cloning: A Laboratory Manual,
CSH Press 1989, pp. 15.3-15.108; Weiner et al. (1993), Gene 126:35-41; Sayers et al.
(1992), Biotechniques 13:592-6; Jones and Winistorfer (1992), Biotechniques 12:528-30;
Barton et al. (1990), Nucleic Acids Res. 18:7349-55; Marotti and Tomich (1989), Gene
Anal. Tech. 6:67-70; and Zhu (1989), Anal. Biochem. 177:120-4. Such mutated genes may
be used to study structure-function relationships of an MPTS protein, or to alter
properties of the protein that affect its function or regulation.

The subject nucleic acids can be used to generate transgenic, non-human
animals or site specific gene modifications in cell lines. Transgenic animals may be
made through homologous recombination, where the endogenous locus is altered.
Alternatively, a nucleic acid construct is randomly integrated into the genome. Vectors
for stable integration include plasmids, retroviruses and other animal viruses, YACs,

regulation. Of interest is the use of the subject genes to construct transgenic animal
models of MPTS related disease conditions, including aggrecanase related disease
conditions, e.g. disease conditions associated with aggrecanase activity, such as arthritis.
Thus, transgenic animal models of the subject invention include endogenous MPTS
gene knockouts in which expression of endogenous MPTS is at least reduced if not
eliminated, where such animals also typically express an MPTS peptide of the subject
invention, e.g. the specific MPTS proteins of the subject invention or a fragment
thereof. Where a nucleic acid having a sequence found in the human MPTS gene is
introduced, the introduced nucleic acid may be either a complete or partial sequence of
the MPTS gene. A detectable marker, such as lac Z, may be introduced into the MPTS
locus, where upregulation of gene expression will result in an easily detected change in
phenotype. One may also provide for expression of the gene or variants thereof in cells
or tissues where it is not normally expressed, at levels not normally present in such cells
or tissues.

DNA constructs for homologous recombination will comprise at least a portion
of an MPTS gene of the subject invention, wherein the gene has the desired genetic
modification(s), and includes regions of homology to the target locus. DNA constructs
for random integration need not include regions of homology to mediate
recombination. Conveniently, markers for positive and negative selection are included.
Methods for generating cells having targeted gene modifications through homologous
recombination are known in the art. For various techniques for transfecting
mammalian cells, see Keown et al. (1990), Meth. Enzymol. 185:527-537.

For embryonic stem (ES) cells, an ES cell line may be employed, or embryonic
cells may be obtained freshly from a host, e.g. mouse, rat, guinea pig, etc. Such cells are
grown on an appropriate fibroblast-feeder layer or grown in the presence of leukemia
inhibiting factor (LIF). When ES or embryonic cells have been transformed, they may
be used to produce transgenic animals. After transformation, the cells are plated onto a
feeder layer in an appropriate medium. Cells containing the construct may be detected
by employing a selective medium. After sufficient time for colonies to grow, they are
picked and analyzed for the occurrence of homologous recombination or integration of
the construct. Those colonies that are positive may then be used for embryo
manipulation and blastocyst injection. Blastocysts are obtained from 4 to 6 week old
superovulated females. The ES cells are trypsinized, and the modified cells are injected
into the blastocoel of the blastocyst. After injection, the blastocysts are returned to each
uterine horn of pseudopregnant females. Females are then allowed to go to term and
the resulting offspring screened for the construct. By providing for a different
phenotype of the blastocyst and the genetically modified cells, chimeric progeny can be
readily detected.

The chimeric animals are screened for the presence of the modified gene and 
males and females having the modification are mated to produce homozygous progeny. 
If the gene alterations cause lethality at some point in development, tissues or organs
can be maintained as allogeneic or congenic grafts or transplants, or in in vitro culture. 
The transgenic animals may be any non-human mammal, such as laboratory animals, 
domestic animals, etc. The transgenic animals may be used in functional studies, drug 
screening, etc., e.g. to determine the effect of a candidate drug on aggrecanase activity. 

Also provided are methods of diagnosing disease states based on observed levels
of an MPTS protein or the expression level of the gene in a biological sample of interest.
Samples, as used herein, include biological fluids such as blood, cerebrospinal fluid,
tears, saliva, lymph, dialysis fluid, and the like; organ or tissue culture derived fluids;
and fluids extracted from physiological tissues. Also included in the term are derivatives
and fractions of such fluids. The cells may be dissociated, in the case of solid tissues, or
tissue sections may be analyzed. Alternatively a lysate of the cells may be prepared.

A number of methods are available for determining the expression level of a
gene or protein in a particular sample. Diagnosis may be performed by a number of
methods to determine the absence or presence or altered amounts of normal or
abnormal MPTS in a patient sample. For example, detection may utilize staining of cells
or histological sections with labeled antibodies, performed in accordance with
conventional methods. Cells are permeabilized to stain cytoplasmic molecules. The

other labels for direct detection. Alternatively, a second stage antibody or reagent is
used to amplify the signal. Such reagents are well known in the art. For example, the
primary antibody may be conjugated to biotin, with horseradish peroxidase-conjugated
avidin added as a second stage reagent. Alternatively, the secondary antibody is
conjugated to a fluorescent compound, e.g. fluorescein, rhodamine, Texas red, etc. Final
detection uses a substrate that undergoes a color change in the presence of the 
peroxidase. The absence or presence of antibody binding may be determined by various 
methods, including flow cytometry of dissociated cells, microscopy, radiography, 
scintillation counting, etc. 

Alternatively, one may focus on the expression of the MPTS gene. Biochemical 
studies may be performed to determine whether a sequence polymorphism in an MPTS 
coding region or control regions is associated with disease. Disease associated 
polymorphisms may include deletion or truncation of the gene, mutations that alter 
expression level, that affect the activity of the protein, etc. 

Changes in the promoter or enhancer sequence that may affect expression levels
of MPTS can be compared to expression levels of the normal allele by various methods
known in the art. Methods for determining promoter or enhancer strength include
quantitation of the expressed natural protein; insertion of the variant control element
into a vector with a reporter gene such as β-galactosidase, luciferase, chloramphenicol
acetyltransferase, etc. that provides for convenient quantitation; and the like.

A number of methods are available for analyzing nucleic acids for the presence
of a specific sequence, e.g. a disease associated polymorphism. Where large amounts of 
DNA are available, genomic DNA is used directly. Alternatively, the region of interest is 
cloned into a suitable vector and grown in sufficient quantity for analysis. Cells that 
express an MPTS protein may be used as a source of mRNA, which may be assayed 
directly or reverse transcribed into cDNA for analysis. The nucleic acid may be 
amplified by conventional techniques, such as the polymerase chain reaction (PCR), to 
provide sufficient amounts for analysis. The use of the polymerase chain reaction is 
described in Saiki, et al. (1985), Science 239:487, and a review of techniques may be
found in Sambrook, et al Molecular Cloning: A Laboratory Manual , CSH Press 1989, 
pp. 14.2-14.33. Alternatively, various methods are known in the art that utilize 



CA 02332533 2001-02-16 



oligonucleotide ligation as a means of detecting polymorphisms; for examples see Riley
et al. (1990), Nucl. Acids Res. 18:2887-2890; and Delahunty et al. (1996), Am. J. Hum.
Genet. 58:1239-1246.

A detectable label may be included in an amplification reaction. Suitable labels
include fluorochromes, e.g. fluorescein isothiocyanate (FITC), rhodamine, Texas Red,
phycoerythrin, allophycocyanin, 6-carboxyfluorescein (6-FAM),
2',7'-dimethoxy-4',5'-dichloro-6-carboxyfluorescein (JOE), 6-carboxy-X-rhodamine (ROX),
6-carboxy-2',4',7',4,7-hexachlorofluorescein (HEX), 5-carboxyfluorescein (5-FAM) or
N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA), radioactive labels, e.g. 32P, 35S, 3H; etc. The
label may be a two stage system, where the amplified DNA is conjugated to biotin,
haptens, etc. having a high affinity binding partner, e.g. avidin, specific antibodies, etc.,
where the binding partner is conjugated to a detectable label. The label may be
conjugated to one or both of the primers. Alternatively, the pool of nucleotides used in
the amplification is labeled, so as to incorporate the label into the amplification
product.

The sample nucleic acid, e.g. amplified or cloned fragment, is analyzed by one of
a number of methods known in the art. The nucleic acid may be sequenced by dideoxy
or other methods, and the sequence of bases compared to a wild-type gene sequence.
Hybridization with the variant sequence may also be used to determine its presence, by
Southern blots, dot blots, etc. The hybridization pattern of a control and variant
sequence to an array of oligonucleotide probes immobilized on a solid support, as
described in US 5,445,934, or in WO 95/35505, may also be used as a means of detecting
the presence of variant sequences. Single strand conformational polymorphism (SSCP)
analysis, denaturing gradient gel electrophoresis (DGGE), and heteroduplex analysis in
gel matrices are used to detect conformational changes created by DNA sequence
variation as alterations in electrophoretic mobility. Alternatively, where a
polymorphism creates or destroys a recognition site for a restriction endonuclease, the

Screening for mutations in MPTS may be based on the functional or antigenic
characteristics of the protein. Protein truncation assays are useful in detecting deletions
that may affect the biological activity of the protein. Various immunoassays designed to
detect polymorphisms in MPTS proteins may be used in screening. Where many
diverse genetic mutations lead to a particular disease phenotype, functional protein
assays have proven to be effective screening tools. The activity of the encoded MPTS
protein may be determined by comparison with the wild-type protein.

Diagnostic methods of the subject invention in which the level of MPTS gene
expression is of interest will typically involve comparison of the MPTS nucleic acid
abundance of a sample of interest with that of a control value to determine any relative
differences, where the difference may be measured qualitatively and/or quantitatively,
which differences are then related to the presence or absence of an abnormal MPTS
gene expression pattern. A variety of different methods for determining the nucleic acid
abundance in a sample are known to those of skill in the art, where particular methods
of interest include those described in: Pietu et al., Genome Res. (June 1996) 6: 492-503;
Zhao et al., Gene (April 24, 1995) 156: 207-213; Soares, Curr. Opin. Biotechnol.
(October 1997) 8: 542-546; Raval, J. Pharmacol. Toxicol. Methods (November 1994) 32:
125-127; Chalifour et al., Anal. Biochem. (February 1, 1994) 216: 299-304; Stolz & Tuan,
Mol. Biotechnol. (December 1996) 6: 225-250; Hong et al., Bioscience Reports (1982) 2:
907; and McGraw, Anal. Biochem. (1984) 143: 298. Also of interest are the methods
disclosed in WO 97/27317, the disclosure of which is herein incorporated by reference.

The subject polypeptides find use in various screening assays designed to
identify therapeutic agents. In vitro screening assays can be employed in which the
activity of an MPTS polypeptide, e.g. the aggrecanase activity of an MPTS polypeptide,
is assessed in the presence of a candidate therapeutic agent and compared to a control,
i.e. the activity in the absence of the candidate therapeutic agent. Activity can be
determined in a number of different ways, where activity may generally be determined
as ability to cleave aggrecan or at least a fragment thereof, as well as a recombinant
polypeptide that includes the aggrecanase cleavage site, as described above. Such assays
are described in U.S. Patent No. 5,872,209 and WO 99/05921 as well as Arner et al., J.
Biol. Chem. (March 1999) 274: 6594-6601.



Also of interest in screening assays are non-human transgenic animals that 
express functional MPTS, where such animals are described above. In many 
embodiments, the animals lack the corresponding endogenous MPTS. In using such 
animals for screening applications, a test compound(s) is administered to the animal, 
and the resultant changes in phenotype, e.g. presence of aggrecan fragments produced by
cleavage of the Glu373-Ala374 bond, are compared with a control.

Alternatively, in vitro models of MPTS binding activity may be measured in 
which binding events between MPTS and candidate MPTS modulatory agents are 
monitored. 

A variety of other reagents may be included in the screening assays, depending 
on the particular screening protocols employed. These include reagents like salts, 
neutral proteins, e.g. albumin, detergents, etc., that are used to facilitate optimal
protein-protein binding and/or reduce non-specific or background interactions. Reagents
that improve the efficiency of the assay, such as protease inhibitors, nuclease inhibitors,
anti-microbial agents, etc. may be used.

A variety of different candidate therapeutic agents that serve as either MPTS
agonists or antagonists may be screened by the above methods. Candidate agents
encompass numerous chemical classes, though typically they are organic molecules,
preferably small organic compounds having a molecular weight of more than 50 and
less than about 2,500 daltons. Candidate agents comprise functional groups necessary
for structural interaction with proteins, particularly hydrogen bonding, and typically
include at least an amine, carbonyl, hydroxyl or carboxyl group, preferably at least two
of the functional chemical groups. The candidate agents often comprise cyclical carbon
or heterocyclic structures and/or aromatic or polyaromatic structures substituted with
one or more of the above functional groups. Candidate agents are also found among
biomolecules including peptides, saccharides, fatty acids, steroids, purines, pyrimidines,

synthetic or natural compounds. For example, numerous means are available for
random and directed synthesis of a wide variety of organic compounds and
biomolecules, including expression of randomized oligonucleotides and oligopeptides.
Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant and
animal extracts are available or readily produced. Additionally, natural or synthetically
produced libraries and compounds are readily modified through conventional
chemical, physical and biochemical means, and may be used to produce combinatorial
libraries. Known pharmacological agents may be subjected to directed or random
chemical modifications, such as acylation, alkylation, esterification, amidification, etc.
to produce structural analogs.

Of particular interest in many embodiments are screening methods that identify 
agents that selectively modulate, e.g. inhibit, the subject MPTS enzyme and not other 
proteases. 

The nucleic acid compositions of the subject invention also find use as
therapeutic agents in situations where one wishes to enhance an MPTS activity in a host.
The MPTS genes, gene fragments, or the encoded proteins or protein fragments are
useful in gene therapy to treat disorders associated with MPTS defects, including
aggrecanase defects. Expression vectors may be used to introduce the gene into a cell.
Such vectors generally have convenient restriction sites located near the promoter
sequence to provide for the insertion of nucleic acid sequences. Transcription cassettes
may be prepared comprising a transcription initiation region, the target gene or
fragment thereof, and a transcriptional termination region. The transcription cassettes
may be introduced into a variety of vectors, e.g. plasmid; retrovirus, e.g. lentivirus;
adenovirus; and the like, where the vectors are able to transiently or stably be
maintained in the cells, usually for a period of at least about one day, more usually for a
period of at least about several days to several weeks.

The gene or protein may be introduced into tissues or host cells by any number
of routes, including viral infection, microinjection, or fusion of vesicles. Jet injection
may also be used for intramuscular administration, as described by Furth et al. (1992),
Anal. Biochem. 205:365-368. The DNA may be coated onto gold microparticles, and
delivered intradermally by a particle bombardment device, or "gene gun" as described in
the literature (see, for example, Tang et al. (1992), Nature 356:152-154), where gold
microprojectiles are coated with the DNA, then bombarded into skin cells.

The subject invention provides methods of modulating MPTS, and in many
embodiments aggrecanase, activity in a cell, including methods of increasing MPTS
activity (e.g. methods of enhancing ), as well as methods of reducing or inhibiting
MPTS activity, e.g. methods of stopping or limiting aggrecan cleavage. In such methods,
an effective amount of a modulatory agent is contacted with the cell.

Also provided are methods of modulating, including enhancing and inhibiting,
MPTS activity in a host. In such methods, an effective amount of active agent that
modulates the activity of an MPTS protein in vivo, e.g. where the agent usually enhances
or inhibits the target MPTS activity, is administered to the host. The active agent may be
a variety of different compounds, including a naturally occurring or synthetic small
molecule compound, an antibody, fragment or derivative thereof, an antisense
composition, and the like.

Of particular interest in certain embodiments are agents that reduce MPTS
activity, including agents that reduce aggrecanase activity, e.g. aggrecan cleavage, by at
least about 10 fold, usually at least about 20 fold and more usually at least about 25 fold,
as measured by the assay described in Arner et al. (1999), supra. In many embodiments,
of particular interest is the use of compounds that reduce aggrecanase activity by at least
iu;i fold, as compared to a control.
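
The fold-reduction criteria above compare activity with a candidate agent to activity in a control. As a rough illustration, the sketch below computes that ratio and tests it against the 10, 20 and 25 fold levels named in the text. The activity values are hypothetical readouts in arbitrary units; the function names and the treatment of a zero reading as an infinite fold reduction are assumptions of this example, not features of the assay in Arner et al.

```python
def fold_reduction(control_activity, treated_activity):
    """Ratio of control activity to activity measured with the agent present.

    Both values must be in the same (arbitrary) units. A treated reading of
    zero or below is reported as an infinite fold reduction.
    """
    if control_activity <= 0:
        raise ValueError("control activity must be positive")
    if treated_activity <= 0:
        return float("inf")
    return control_activity / treated_activity


def meets_threshold(control_activity, treated_activity, min_fold=10.0):
    """True if the agent reduces activity by at least `min_fold` fold."""
    return fold_reduction(control_activity, treated_activity) >= min_fold


# Hypothetical readouts: control well versus well containing the agent.
control, treated = 100.0, 4.0
print(fold_reduction(control, treated))
for level in (10, 20, 25):  # levels named in the text
    print(level, meets_threshold(control, treated, min_fold=level))
```

Running the same comparison against a second protease would give a simple check of selectivity, provided the two assays are read out on comparable scales.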

Also of interest is the use of agents that, while providing for reduced MPTS,
including aggrecanase, activity, do not substantially reduce the activity of other
proteinases, if at all. Thus, the agents in this embodiment are selective inhibitors of
MPTS. An agent is considered to be selective if it provides for the above reduced
aggrecanase activity, but substantially no reduced activity of at least one other

Naturally occurring or synthetic small molecule compounds of interest include
numerous chemical classes, though typically they are organic molecules, preferably 
small organic compounds having a molecular weight of more than 50 and less than 
about 2,500 daltons. Candidate agents comprise functional groups necessary for 
structural interaction with proteins, particularly hydrogen bonding, and typically 
include at least an amine, carbonyl, hydroxyl or carboxyl group, preferably at least two 
of the functional chemical groups. The candidate agents often comprise cyclical carbon 
or heterocyclic structures and/or aromatic or polyaromatic structures substituted with 
one or more of the above functional groups. Candidate agents are also found among 
biomolecules including peptides, saccharides, fatty acids, steroids, purines, pyrimidines, 
derivatives, structural analogs or combinations thereof. 
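
The molecular weight window and functional group criteria above lend themselves to a simple pre-filter over a compound list. The following is a minimal sketch assuming each compound has already been annotated with its molecular weight and functional groups; the `Candidate` record, the example entries and the hard cutoff at 2,500 daltons (the text says "about") are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Functional groups named in the text as typical for candidate agents.
FUNCTIONAL_GROUPS = frozenset({"amine", "carbonyl", "hydroxyl", "carboxyl"})


@dataclass(frozen=True)
class Candidate:
    """Hypothetical annotation record for one compound."""
    name: str
    mol_weight: float   # daltons
    groups: frozenset   # functional groups present


def fits_profile(candidate, min_groups=1):
    """True if the compound lies in the 50-2,500 Da window and carries at
    least `min_groups` of the listed functional groups (2 is preferred)."""
    in_window = 50 < candidate.mol_weight < 2500
    n_groups = len(FUNCTIONAL_GROUPS & candidate.groups)
    return in_window and n_groups >= min_groups


# Hypothetical library entries, for illustration only.
library = [
    Candidate("compound A", 320.4, frozenset({"amine", "carbonyl"})),
    Candidate("compound B", 45.0, frozenset({"hydroxyl"})),
    Candidate("compound C", 1800.0, frozenset({"thiol"})),
]

for c in library:
    print(c.name, fits_profile(c), fits_profile(c, min_groups=2))
```

Such a filter only narrows a library; activity still has to be established in the screening assays described above.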

Also of interest as active agents are antibodies that at least reduce, if not inhibit,
the target MPTS, e.g. aggrecanase, activity in the host. Suitable antibodies are obtained
by immunizing a host animal with peptides comprising all or a portion of the target
protein, e.g. MPTS-15, MPTS-19 or MPTS-20. Suitable host animals include mouse, rat,
sheep, goat, hamster, rabbit, etc. The origin of the protein immunogen may be mouse,
human, rat, monkey etc. The host animal will generally be a different species than the
immunogen, e.g. human MPTS used to immunize mice, etc.

The immunogen may comprise the complete protein, or fragments and 
derivatives thereof. Preferred immunogens comprise all or a part of MPTS, where these 
residues contain the post-translation modifications, such as glycosylation, found on the 
native target protein. Immunogens comprising the extracellular domain are produced 
in a variety of ways known in the art, e.g. expression of cloned genes using conventional
recombinant methods, isolation from HKC, etc. 

For preparation of polyclonal antibodies, the first step is immunization of the
host animal with the target protein, where the target protein will preferably be in
substantially pure form, comprising less than about 1% contaminant. The immunogen
may comprise the complete target protein, fragments or derivatives thereof. To increase
the immune response of the host animal, the target protein may be combined with an
adjuvant, where suitable adjuvants include alum, dextran sulfate, large polymeric
anions, oil & water emulsions, e.g. Freund's adjuvant, Freund's complete adjuvant, and
the like. The target protein may also be conjugated to synthetic carrier proteins or
synthetic antigens. A variety of hosts may be immunized to produce the polyclonal
antibodies. Such hosts include rabbits, guinea pigs, rodents (e.g. mice, rats), sheep, goats,
and the like. The target protein is administered to the host, usually intradermally, with
an initial dosage followed by one or more, usually at least two, additional booster
dosages. Following immunization, the blood from the host will be collected, followed by
separation of the serum from the blood cells. The Ig present in the resultant antiserum
may be further fractionated using known methods, such as ammonium salt
fractionation, DEAE chromatography, and the like.



Monoclonal antibodies are produced by conventional techniques. Generally, the
spleen and/or lymph nodes of an immunized host animal provide a source of plasma
cells. The plasma cells are immortalized by fusion with myeloma cells to produce
hybridoma cells. Culture supernatant from individual hybridomas is screened using
standard techniques to identify those producing antibodies with the desired specificity.
Suitable animals for production of monoclonal antibodies to the human protein include
mouse, rat, hamster, etc. To raise antibodies against the mouse protein, the animal will
generally be a hamster, guinea pig, rabbit, etc. The antibody may be purified from the
hybridoma cell supernatants or ascites fluid by conventional techniques, e.g. affinity
chromatography using MPTS bound to an insoluble support, protein A sepharose, etc.
Therefore it is an object of the present invention to provide monoclonal antibodies
binding specifically to the MPTS proteins of the present invention, more specifically
such antibodies which inhibit aggrecanase activity and such antibodies which are
human or humanized ones.



The antibody may be produced as a single chain, instead of the normal
multimeric structure. Single chain antibodies are described in Jost et al (1994) J.B.C.
269:26267-73, and others. DNA sequences encoding the variable region of the heavy

For in vivo use, particularly for injection into humans, it is desirable to decrease
the antigenicity of the antibody. An immune response of a recipient against the
blocking agent will potentially decrease the period of time that the therapy is effective.
Methods of humanizing antibodies are known in the art. The humanized antibody may
be the product of an animal having transgenic human immunoglobulin constant region
genes (see for example WO 90/10077 and WO 90/04036). Alternatively, the antibody of
interest may be engineered by recombinant DNA techniques to substitute the CH1,
CH2, CH3, hinge domains, and/or the framework domain with the corresponding
human sequence (see WO 92/02190).


The use of Ig cDNA for construction of chimeric immunoglobulin genes is
known in the art (Liu et al (1987) P.N.A.S. 84:3439 and (1987) J. Immunol. 139:3521).
mRNA is isolated from a hybridoma or other cell producing the antibody and used to
produce cDNA. The cDNA of interest may be amplified by the polymerase chain
reaction using specific primers (U.S. Patent Nos. 4,683,195 and 4,683,202).
Alternatively, a library is made and screened to isolate the sequence of interest. The
DNA sequence encoding the variable region of the antibody is then fused to human
constant region sequences. The sequences of human constant region genes may be
found in Kabat et al (1991) Sequences of Proteins of Immunological Interest, N.I.H.
publication no. 91-3242. Human C region genes are readily available from known
clones. The choice of isotype will be guided by the desired effector functions, such as
complement fixation, or activity in antibody-dependent cellular cytotoxicity. Preferred
isotypes are IgG1, IgG3 and IgG4. Either of the human light chain constant regions,
kappa or lambda, may be used. The chimeric, humanized antibody is then expressed by
conventional methods.

In yet other embodiments, the antibodies may be fully human antibodies. For
example, xenogeneic antibodies which are identical to human antibodies may be
employed. By xenogeneic human antibodies is meant antibodies that are the same as
human antibodies, i.e. they are fully human antibodies, with the exception that they are
produced using a non-human host which has been genetically engineered to express
human antibodies, e.g. WO 98/50433; WO 98/24893 and WO 99/53049.

Antibody fragments, such as Fv, F(ab')2 and Fab may be prepared by cleavage of
the intact protein, e.g. by protease or chemical cleavage. Alternatively, a truncated gene
is designed. For example, a chimeric gene encoding a portion of the F(ab')2 fragment
would include DNA sequences encoding the CH1 domain and hinge region of the H
chain, followed by a translational stop codon to yield the truncated molecule.

Consensus sequences of H and L J regions may be used to design
oligonucleotides for use as primers to introduce useful restriction sites into the J region
for subsequent linkage of V region segments to human C region segments. C region
cDNA can be modified by site directed mutagenesis to place a restriction site at the
analogous position in the human sequence.

Expression vectors include plasmids, retroviruses, YACs, EBV derived episomes,
and the like. A convenient vector is one that encodes a functionally complete human
CH or CL immunoglobulin sequence, with appropriate restriction sites engineered so
that any VH or VL sequence can be easily inserted and expressed. In such vectors,
splicing usually occurs between the splice donor site in the inserted J region and the
splice acceptor site preceding the human C region, and also at the splice regions that
occur within the human CH exons. Polyadenylation and transcription termination
occur at native chromosomal sites downstream of the coding regions. The resulting
chimeric antibody may be joined to any strong promoter, including retroviral LTRs, e.g.
SV-40 early promoter (Okayama et al. (1983) Mol. Cell. Bio. 3:280), Rous sarcoma
virus LTR (Gorman et al. (1982) P.N.A.S. 79:6777), and moloney murine leukemia virus
LTR (Grosschedl et al. (1985) Cell 41:885); native Ig promoters, etc.

In yet other embodiments of the invention, the active agent is an agent that
modulates, and generally decreases or down regulates, the expression of the gene
encoding the target protein in the host. For example, antisense molecules can be used to
down-regulate expression of MPTS in cells. The anti-sense reagent may ...
targeted gene, and inhibits expression of the targeted gene products. Antisense
molecules inhibit gene expression through various mechanisms, e.g. by reducing the
amount of mRNA available for translation, through activation of RNAse H, or steric
hindrance. One or a combination of antisense molecules may be administered, where a
combination may comprise multiple different sequences.


Antisense molecules may be produced by expression of all or a part of the target
gene sequence in an appropriate vector, where the transcriptional initiation is oriented
such that an antisense strand is produced as an RNA molecule. Alternatively, the
antisense molecule is a synthetic oligonucleotide. Antisense oligonucleotides will
generally be at least about 7, usually at least about 12, more usually at least about 20
nucleotides in length, and not more than about 500, usually not more than about 50,
more usually not more than about 35 nucleotides in length, where the length is
governed by efficiency of inhibition, specificity, including absence of cross-reactivity,
and the like. It has been found that short oligonucleotides, of from 7 to 8 bases in
length, can be strong and selective inhibitors of gene expression (see Wagner et al
(1996), Nature Biotechnol 14:840-844).
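
The relationship between a chosen region of the sense strand and the antisense oligonucleotide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example sequence are hypothetical and are not taken from any MPTS sequence, and the length classes simply restate the approximate ranges given above.

```python
# Hypothetical helpers; names and the example sequence are illustrative only.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def reverse_complement(seq: str) -> str:
    """Return the DNA reverse complement of a DNA or RNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1].upper()


def length_class(length: int) -> str:
    """Classify an oligonucleotide length against the ranges stated in the text."""
    if 20 <= length <= 35:
        return "more usual"
    if 12 <= length <= 50:
        return "usual"
    if 7 <= length <= 500:
        return "permitted"
    return "outside stated range"


def antisense_oligo(sense: str, start: int, length: int) -> str:
    """Antisense DNA oligonucleotide complementary to sense[start:start+length]."""
    window = sense[start:start + length]
    if len(window) < length:
        raise ValueError("requested window runs past the end of the sequence")
    if length_class(length) == "outside stated range":
        raise ValueError("length outside the approximately 7 to 500 nt range")
    return reverse_complement(window)


if __name__ == "__main__":
    example_mrna = "AUGGCCUUCAGGGCCGAUCUGA"  # made-up sense-strand fragment
    oligo = antisense_oligo(example_mrna, 0, 20)
    print(oligo, length_class(len(oligo)))
```

In practice several candidate windows would be generated this way and then assayed empirically for inhibition, since sequence complementarity alone does not predict efficacy.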

A specific region or regions of the endogenous sense strand mRNA sequence is
chosen to be complemented by the antisense sequence. Selection of a specific sequence
for the oligonucleotide may use an empirical method, where several candidate
sequences are assayed for inhibition of expression of the target gene in an in vitro or
animal model. A combination of sequences may also be used, where several regions of
the mRNA sequence are selected for antisense complementation.

Antisense oligonucleotides may be chemically synthesized by methods known in
the art (see Wagner et al (1993), supra, and Milligan et al, supra.) Preferred
oligonucleotides are chemically modified from the native phosphodiester structure, in
order to increase their intracellular stability and binding affinity. A number of such
modifications have been described in the literature, which alter the chemistry of the
backbone, sugars or heterocyclic bases.

Among useful changes in the backbone chemistry are phosphorothioates;
phosphorodithioates, where both of the non-bridging oxygens are substituted with
sulfur; phosphoroamidites; alkyl phosphotriesters and boranophosphates. Achiral
phosphate derivatives include 3'-O'-5'-S-phosphorothioate, 3'-S-5'-O-
phosphorothioate, 3'-CH2-5'-O-phosphonate and 3'-NH-5'-O-phosphoroamidate.
Peptide nucleic acids replace the entire ribose phosphodiester backbone with a peptide
linkage. Sugar modifications are also used to enhance stability and affinity. The α-
anomer of deoxyribose may be used, where the base is inverted with respect to the
natural β-anomer. The 2'-OH of the ribose sugar may be altered to form 2'-O-methyl
or 2'-O-allyl sugars, which provides resistance to degradation without compromising
affinity. Modification of the heterocyclic bases must maintain proper base pairing.
Some useful substitutions include deoxyuridine for deoxythymidine; 5-methyl-2'-
deoxycytidine and 5-bromo-2'-deoxycytidine for deoxycytidine. 5-propynyl-2'-
deoxyuridine and 5-propynyl-2'-deoxycytidine have been shown to increase affinity and
biological activity when substituted for deoxythymidine and deoxycytidine, respectively.

As an alternative to anti-sense inhibitors, catalytic nucleic acid compounds, e.g.
ribozymes, anti-sense conjugates, etc. may be used to inhibit gene expression.
Ribozymes may be synthesized in vitro and administered to the patient, or may be
encoded on an expression vector, from which the ribozyme is synthesized in the
targeted cell (for example, see International patent application WO 9523225, and
Beigelman et al (1995), Nucl. Acids Res. 23:4434-42). Examples of oligonucleotides with
catalytic activity are described in WO 9506764. Conjugates of anti-sense ODN with a
metal complex, e.g. terpyridylCu(II), capable of mediating mRNA hydrolysis are
described in Bashkin et al. (1995), Appl. Biochem. Biotechnol. 54:43-56.

It is therefore an object of the present invention to provide a method of
modulating MPTS activity in a host, said method comprising: administering an effective
amount of an MPTS modulatory agent to said host, or more specifically such a method
wherein said modulatory agent is a small molecule, an antibody, or a nucleic acid.

As mentioned above, an effective amount of the active agent is administered to

e.g. aggrecanase, activity, as measured by aggrecan cleavage product production, as
compared to a control.

In the subject methods, the active agent(s) may be administered to the host
using any convenient means capable of resulting in the desired modulation of MPTS
activity, e.g. desired reduction in aggrecan cleavage product production. Thus, the agent
can be incorporated into a variety of formulations for therapeutic administration. More
particularly, the agents of the present invention can be formulated into pharmaceutical
compositions by combination with appropriate, pharmaceutically acceptable carriers or
diluents, and may be formulated into preparations in solid, semi-solid, liquid or gaseous
forms, such as tablets, capsules, powders, granules, ointments, solutions, suppositories,
injections, inhalants and aerosols.

As such, administration of the agents can be achieved in various ways, including
oral, buccal, rectal, parenteral, intraperitoneal, intradermal, transdermal, intratracheal, etc.,
administration.

In pharmaceutical dosage forms, the agents may be administered in the form of
their pharmaceutically acceptable salts, or they may also be used alone or in appropriate
association, as well as in combination, with other pharmaceutically active compounds.

For oral preparations, the agents can be used alone or in combination with
appropriate additives to make tablets, powders, granules or capsules, for example, with
conventional additives, such as lactose, mannitol, corn starch or potato starch; with
binders, such as crystalline cellulose, cellulose derivatives, acacia, corn starch or gelatins;
with disintegrators, such as corn starch, potato starch or sodium
carboxymethylcellulose; with lubricants, such as talc or magnesium stearate; and if
desired, with diluents, buffering agents, moistening agents, preservatives and flavoring
agents.


The agents can be formulated into preparations for injection by dissolving,
suspending or emulsifying them in an aqueous or nonaqueous solvent, such as vegetable
or other similar oils, synthetic aliphatic acid glycerides, esters of higher aliphatic acids or
propylene glycol; and if desired, with conventional additives such as solubilizers,
isotonic agents, suspending agents, emulsifying agents, stabilizers and preservatives.

The agents can be utilized in aerosol formulation to be administered via
inhalation. The compounds of the present invention can be formulated into pressurized
acceptable propellants such as dichlorodifluoromethane, propane, nitrogen and the like.

Furthermore, the agents can be made into suppositories by mixing with a variety
of bases such as emulsifying bases or water-soluble bases. The compounds of the present
invention can be administered rectally via a suppository. The suppository can include
vehicles such as cocoa butter, carbowaxes and polyethylene glycols, which melt at body
temperature, yet are solidified at room temperature.

Unit dosage forms for oral or rectal administration such as syrups, elixirs, and
suspensions may be provided wherein each dosage unit, for example, teaspoonful,
tablespoonful, tablet or suppository, contains a predetermined amount of the
composition containing one or more inhibitors. Similarly, unit dosage forms for
injection or intravenous administration may comprise the inhibitor(s) in a composition
as a solution in sterile water, normal saline or another pharmaceutically acceptable
carrier.

The term "unit dosage form," as used herein, refers to physically discrete units
suitable as unitary dosages for human and animal subjects, each unit containing a
predetermined quantity of compounds of the present invention calculated in an amount
sufficient to produce the desired effect in association with a pharmaceutically acceptable
diluent, carrier or vehicle. The specifications for the novel unit dosage forms of the
present invention depend on the particular compound employed and the effect to be
achieved, and the pharmacodynamics associated with each compound, in the host.



agents, stabilizers, wetting agents and the like, are readily available to the public.

Where the agent is a polypeptide, polynucleotide, analog or mimetic thereof, e.g.
antisense composition, it may be introduced into tissues or host cells by any number of
routes, including viral infection, microinjection, or fusion of vesicles. Jet injection may
also be used for intramuscular administration, as described by Furth et al. (1992), Anal
Biochem 205:365-368. The DNA may be coated onto gold microparticles, and delivered
intradermally by a particle bombardment device, or "gene gun" as described in the
literature (see, for example, Tang et al. (1992), Nature 356:152-154), where gold
microprojectiles are coated with the therapeutic DNA, then bombarded into skin cells.

Those of skill in the art will readily appreciate that dose levels can vary as a
function of the specific compound, the severity of the symptoms and the susceptibility
of the subject to side effects. Preferred dosages for a given compound are readily
determinable by those of skill in the art by a variety of means.

The subject methods find use in the treatment of a variety of different disease
conditions involving MPTS activity, including disease conditions involving aggrecanase
activity. Of particular interest is the use of the subject methods to treat disease
conditions characterized by the presence of aggrecan cleavage products, particularly 60
kDa aggrecan cleavage products having an ARGS N-terminus. Specific diseases that are
characterized by the presence of such products include: rheumatoid arthritis, osteo-
arthritis, infectious arthritis, gouty arthritis, psoriatic arthritis, spondolysis, sports
injury, joint trauma, pulmonary disease, fibrosis, and the like.

By treatment is meant at least an amelioration of the symptoms associated with
the pathological condition afflicting the host, where amelioration is used in a broad
sense to refer to at least a reduction in the magnitude of a parameter, e.g. symptom,
associated with the pathological condition being treated, such as hyperphosphatemia.
As such, treatment also includes situations where the pathological condition, or at least
symptoms associated therewith, are completely inhibited, e.g. prevented from
happening, or stopped, e.g. terminated, such that the host no longer suffers from the
pathological condition, or at least the symptoms that characterize the pathological
condition.

A variety of hosts are treatable according to the subject methods. Generally such
hosts are "mammals" or "mammalian," where these terms are used broadly to describe
organisms which are within the class mammalia, including the orders carnivore (e.g.,
dogs and cats), rodentia (e.g., mice, guinea pigs, and rats), and primates (e.g., humans,
chimpanzees, and monkeys). In many embodiments, the hosts will be humans.

Kits with unit doses of the active agent, usually in oral or injectable doses, are
provided. In such kits, in addition to the containers containing the unit doses will be an
informational package insert describing the use and attendant benefits of the drugs in
treating the pathological condition of interest. Preferred compounds and unit doses are
those described herein above.

Finally it is an object of the present invention:

A method of screening to identify MPTS modulatory agents, said method
comprising:

contacting an MPTS protein as defined herein with a substrate in the
presence of a potential modulatory agent; and

determining the effect of said modulatory agent on the activity of said protein.

The method as defined in (1), wherein said substrate comprises a glu-ala
bond.

The method as defined in (1), wherein said substrate is aggrecan or a
fragment thereof.

Furthermore, a method of treating a host suffering from a disease condition
associated with MPTS activity specifically wherein said disease condition is
characterized by the presence of aggrecan cleavage products, said method comprising:

specifically wherein said disease condition is characterized by the presence of aggrecan
cleavage products, like arthritis.

Examples 

Example 1

A nucleic acid array carrying 699 known metalloproteinase genes and novel
ESTs available in public and proprietary databases was designed. The sequences on
the array were selected by a search with a seed set of known metalloprotease protein
sequences from all species. These protein sequences were used to find matching
sequences in human nucleotide at the protein (codon) level. Redundant sequences were
eliminated, the remaining sequences were assembled and clustered, and the unique set of 699
sequences was arrayed.

The resultant array was used to screen genes expressed in primary cultures of
chondrocytes. A fair number of metalloproteinases known to be expressed by these cells
were identified. However, a number of ESTs for novel proteins were also identified.
Using these ESTs in subsequent database mining and PCR protocols, four different
human MPTS proteins were identified, i.e. MPTS15, MPTS10, MPTS19 and MPTS20.


Example 2 

Expression of MPTS-10 

An example of a system for expression of mpts-10 is the COS-7 mammalian cell
system. The nucleotide sequence that encodes mpts-10, including the secretion signal
sequence, was ligated into a pcDNA3.1 plasmid (Invitrogen, Carlsbad, CA, USA).
Two micrograms of the resulting plasmid was combined with lipofectamine (Life
Technologies, Rockville, MD, USA). The mixture was then added to COS-7 cells, which
were grown in 6 well plates to a density of approximately 90% confluency. After 6
hours, fresh medium was added to the cells and after 24 hours the cells were washed and
fresh serum free medium containing bovine aggrecan (0.1 mg/ml, Sigma, St. Louis, MO,
USA) was added. The cells were incubated for an additional 48 hours. Five hundred
microliters of culture fluid from each well was collected and concentrated ten fold. Two
microliters of chondroitinase ABC and keratinase (10 u/ml, Sigma, St. Louis, MO, USA)
was then added and the samples incubated overnight at 37°C. The samples were then
boiled in SDS-PAGE sample loading buffer, electrophoresed on a polyacrylamide gel and
transferred to a PVDF membrane. A Western blot using an antiserum against a
neoepitope generated when aggrecanase cleaves aggrecan was then performed.

Another example of a system for expression of mpts-10 was the baculovirus
expression system. The DNA sequence that contained the coding sequence for mpts-10
(including the sequences that code for the secretion signal sequence) and that had been
cloned in the pcDNA3.1 vector was modified by PCR so that the coding sequence and
the translational stop codon were flanked by Not-1 (N-terminal side) and Sfi-1 (C-
terminal side) sites. The primer used for the N-terminal end was

GATC GCGGCCGC TATGGTGGACACGTGGCCTCTATGGCTCC and the primer for
the C-terminal end was

TGA GGCCTTCAGGGCC GATCACTGTGCAGAG( ACTCACCCCAT. After
amplification using standard PCR methods, the fragment was digested with Not-1 and
Sfi-1. The digested fragment was ligated into a vector pVL1392-U, which had also been
digested with Not-1 and Sfi-1. pVL1392-U is a derivation of the baculovirus transfer
plasmid, pVL1392 (PharMingen, San Diego, CA, USA) in which the multiple cloning site
has been modified to contain Not-1 and Sfi-1. The overhangs generated by digestion
with Not-1 and Sfi-1 were complementary to the overhangs generated in the Not-1 and
Sfi-1 digested PCR amplified DNA. The ligated DNA was transformed into bacterial
cells and a clone was selected that contained the plasmid and the correct mpts-10
sequence. This plasmid was produced and purified. The mpts-10 sequence was
transferred into a baculovirus vector using standard techniques (Baculovirus Expression
Vectors: A Laboratory Manual by David O'Reilly, Lois Miller, and Verne Luckow, W.H.
Freeman and Co., New York, USA). Five plaque purified virus preparations were
produced from the virus preparation. Sf9 insect cells growing in suspension were
infected with each of the plaque purified virus preparations at a multiplicity of 0.3.

chondroitinase ABC and keratinase (10 u/ml) at 37°C overnight. The samples were then
examined by Western blotting using an antiserum that reacts with a neoepitope
generated when aggrecan is cleaved by aggrecanase.

Another method for expression of mpts-10 was the drosophila expression
system. The DNA fragment containing the sequences encoding mpts-10 and flanked by
Not-1 and Sfi-1 that had been generated by PCR (see above) was cloned into plasmid
Cmk33. Cmk33 is a plasmid derived from pMK33/pMtHy (Li, Bin et al Biochem J
(1996) 313, 57-64) so that Not-1 and Sfi-1 were in the cloning site. The overhangs
generated by digestion of this plasmid are compatible with the overhangs generated in
the digested DNA containing the mpts-10 fragment (see above). A plasmid containing
the correct sequence of mpts-10 was amplified and purified. Drosophila (S2) cells were
transformed with the plasmid using standard techniques (Li, Bin et al Biochem J (1996)
313, 57-64). Culture fluid was collected 2 days after transfection. These samples were
assayed for aggrecanase activity by incubating with bovine aggrecan (Sigma, St. Louis,
MO, USA) at a concentration of 0.1 mg/ml. The samples were then incubated with both
chondroitinase ABC and keratinase (10 u/ml) at 37°C overnight. The samples were then
examined by Western blotting using an antiserum that reacts with a neoepitope
generated when aggrecan is cleaved by aggrecanase.


Example 3

Expression of MPTS-15

An example of a system for expression of mpts-15 is the COS-7 mammalian cell
system. The nucleotide sequence that encodes mpts-15, including the secretion signal
sequence, was ligated into a pcDNA3.1 plasmid (Invitrogen, Carlsbad, CA, USA).
Two micrograms of the resulting plasmid was combined with lipofectamine (Life
Technologies, Rockville, MD, USA). The mixture was then added to COS-7 cells, which
were grown in 6 well plates to a density of approximately 90% confluency. After 6
hours, fresh medium was added to the cells and after 24 hours the cells were washed and
fresh serum free medium containing bovine aggrecan (0.1 mg/ml, Sigma, St. Louis, MO,
USA) was added. The cells were incubated for an additional 48 hours. Five hundred
microliters of culture fluid from each well was collected and concentrated ten fold. Two
microliters of chondroitinase ABC and keratinase (10 u/ml, Sigma, St. Louis, MO, USA)
was then added and the samples incubated overnight at 37°C. The samples were then
boiled in SDS-PAGE sample loading buffer, electrophoresed on a polyacrylamide gel and
transferred to a PVDF membrane. A Western blot using an antiserum against a
neoepitope generated when aggrecanase cleaves aggrecan was then performed.

Another example of a system for expression of mpts-15 was the baculovirus
expression system. The DNA sequence that contained the coding sequence for mpts-15
(including the sequences that code for the secretion signal sequence) and that had been
cloned in the pcDNA3.1 vector was modified by PCR so that the coding sequence and
the translational stop codon were flanked by Not-1 (N-terminal side) and Sfi-1
(C-terminal side) sites. The primer used for the N-terminal end was

GATC GCGGCCGC TATGGAAATTTTGTGGAAGACGTTG and the primer for the C-
terminal end was TGAGGCCTTCAGGGCCGATC;'ITAAAGCAAAGTITC;T TTTGGT.
After amplification using standard PCR methods, the fragment was digested with Not-1
and Sfi-1. The digested fragment was ligated into a vector pVL1392-U, which had also
been digested with Not-1 and Sfi-1. pVL1392-U is a derivation of the baculovirus
transfer plasmid, pVL1392 (PharMingen, San Diego, CA, USA) in which the multiple
cloning site has been modified to contain Not-1 and Sfi-1. The overhangs generated by
digestion with Not-1 and Sfi-1 were complementary to the overhangs generated in the
Not-1 and Sfi-1 digested PCR amplified DNA. The ligated DNA was transformed into
bacterial cells and a clone was selected that contained the plasmid and the correct mpts-15
sequence. This plasmid was produced and purified. The mpts-15 sequence was
transferred into a baculovirus vector using standard techniques (Baculovirus Expression
Vectors: A Laboratory Manual by David O'Reilly, Lois Miller, and Verne Luckow, W.H.
Freeman and Co., New York, USA). Five plaque purified virus preparations were
produced from the virus preparation. Sf9 insect cells growing in suspension were
infected with each of the plaque purified virus preparations at a multiplicity of 0.3.
Culture fluid was harvested at 3 days after infection. These samples were assayed for

examined by Western blotting using an antiserum that reacts with a neoepitope
generated when aggrecan is cleaved by aggrecanase.
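
The cloning strategy above relies on each primer carrying the intended restriction site ahead of the gene-specific sequence. As a minimal sketch, the Python snippet below checks primers for the Not-1 recognition sequence (GCGGCCGC) and the Sfi-1 recognition sequence (GGCCNNNNNGGCC). The function name is hypothetical; the N-terminal primer is the one given for mpts-15, and for the C-terminal primer only the legible leading portion is used.

```python
import re

# Recognition sequences of the two enzymes used in the cloning strategy.
NOT1_SITE = re.compile("GCGGCCGC")
SFI1_SITE = re.compile("GGCC[ACGT]{5}GGCC")  # GGCCNNNNNGGCC


def find_site(pattern: re.Pattern, primer: str) -> int:
    """Return the 0-based position of the first site in the primer, or -1.

    Whitespace in the primer (as printed in the text) is ignored.
    """
    cleaned = re.sub(r"\s+", "", primer).upper()
    match = pattern.search(cleaned)
    return match.start() if match else -1


if __name__ == "__main__":
    n_terminal_primer = "GATC GCGGCCGC TATGGAAATTTTGTGGAAGACGTTG"
    # Only the legible 5' portion of the C-terminal primer is used here.
    c_terminal_primer_5prime = "TGA GGCCTTCAGGGCC GATC"

    print("Not-1 site at:", find_site(NOT1_SITE, n_terminal_primer))
    print("Sfi-1 site at:", find_site(SFI1_SITE, c_terminal_primer_5prime))
```

A check of this kind only confirms that the sites are present in the primers; it does not verify reading frame or whether the same sites also occur inside the amplified coding sequence, which would need to be examined separately.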

Another method for expression of mpts-15 was the drosophila expression
system. The DNA fragment containing the sequences encoding mpts-15 and flanked by
Not-1 and Sfi-1 that had been generated by PCR (see above) was cloned into plasmid
Cmk33. Cmk33 is a plasmid derived from pMK33/pMtHy (Li, Bin et al Biochem J
(1996) 313, 57-64) so that Not-1 and Sfi-1 were in the cloning site. The overhangs
generated by digestion of this plasmid are compatible with the overhangs generated in
the Not-1 and Sfi-1 digested DNA containing the mpts-15 fragment (see above). A
plasmid containing the correct sequence of mpts-15 was amplified and purified.
Drosophila (S2) cells were transformed with the plasmid using standard techniques (Li,
Bin et al Biochem J (1996) 313, 57-64). Culture fluid was collected 2 days after
transfection. These samples were assayed for aggrecanase activity by incubating with
bovine aggrecan (Sigma, St. Louis, MO, USA) at a concentration of 0.1 mg/ml. The
samples were then incubated with both chondroitinase ABC and keratinase (10 u/ml) at
37°C overnight. The samples were then examined by Western blotting using an
antiserum that reacts with a neoepitope generated when aggrecan is cleaved by
aggrecanase.


Example 4 

Expression of MPTS-19

An example of a system for expression of mpts-19 is the COS-7 mammalian cell
system. The nucleotide sequence that encodes mpts-19, including the secretion signal
sequence and the C-terminal stop codon, was ligated into a pcDNA3.1 plasmid
(Invitrogen, Carlsbad, CA, USA). Two micrograms of the resulting plasmid was
combined with lipofectamine (Life Technologies, Rockville, MD, USA). The mixture
was then added to COS-7 cells, which were grown in 6 well plates to a density of
approximately 90% confluency. After 6 hours, fresh medium was added to the cells and
after 24 hours the cells were washed and fresh serum free medium containing bovine
aggrecan (0.1 mg/ml, Sigma, St. Louis, MO, USA) was added. The cells were incubated for an
additional 48 hours. Five hundred microliters of culture fluid from each well was
collected and concentrated ten fold. Two microliters of chondroitinase ABC and
keratinase (10 u/ml, Sigma, St. Louis, MO, USA) was then added and the samples
incubated overnight at 37°C. The samples were then boiled in SDS-PAGE sample
loading buffer, electrophoresed on a polyacrylamide gel and transferred to a PVDF
membrane. A Western blot using an antiserum against a neoepitope generated when
aggrecanase cleaves aggrecan was then performed.

Anutlici example of a system for expression or mpts- 19 was the baculovirus 
expression system. The DNA sequence that contained the coding sequence for mpts-19 
(including the sequences that code for the secretion signal sequence) and that had been 
cloned in the pcDNA3. 1 vector was modified by PGR so that the coding sequence and 
the translational stop codon were flanked by the Not 1 (N-terminal side) and Sfi-1 
(C-terminal side). The primer used for the N-terminal end was 
GATC GCGGCCGC TATGCCCGGCGGCXa^AGTCCCCG and the primer for the 
C-terminal end was
TGA GGCCTTCAGGGCC GATCTCAGCGGCGGGCAACCCGCTG. After
amplification using standard PCR methods, the fragment was digested with Not-1 and
Sfi-1. The digested fragment was ligated into a vector, pVL1392-U, which had also been
digested with Not-1 and Sfi-1. pVL1392-U is a derivative of the baculovirus transfer
plasmid pVL1392 (PharMingen, San Diego, CA, USA) in which the multiple cloning site
has been modified to contain Not-1 and Sfi-1 sites. The overhangs generated by digestion
with Not-1 and Sfi-1 were complementary to the overhangs generated in the Not-1 and
Sfi-1 digested PCR-amplified DNA. The ligated DNA was transformed into bacterial
cells and a clone was selected that contained the plasmid and the correct mpts-19
sequence. This plasmid was produced and purified. The mpts-19 sequence was
transferred into a baculovirus vector using standard techniques (Baculovirus Expression
Vectors: A Laboratory Manual by David O'Reilly, Lois Miller, and Verne Luckow, W.H.
Freeman and Co., New York, USA). Five plaque-purified virus preparations were
produced from the virus preparation. Sf9 insect cells growing in suspension [...]
a concentration of 0.1 mg/ml. The samples were then incubated with both
chondroitinase ABC and keratinase (10 u/ml) at 37°C overnight. The samples were then
examined by Western blotting using an antiserum that reacts with a neoepitope
generated when aggrecan is cleaved by aggrecanase.
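
As an illustrative check, not part of the procedure described here, the C-terminal primer quoted above can be inspected computationally. The following is a minimal sketch that assumes the standard Sfi-1 recognition sequence (GGCCNNNNNGGCC) and the standard genetic code; the function and variable names are hypothetical. It locates the Sfi-1 site in the primer and reverse-complements the gene-specific 3' portion, which should read as the final codons of the mpts-19 coding sequence followed by the stop codon.

```python
import re

# C-terminal primer for mpts-19 as quoted in the text (spaces removed).
C_PRIMER = "TGAGGCCTTCAGGGCCGATCTCAGCGGCGGGCAACCCGCTG"

# Assumption: standard Sfi-1 recognition sequence, GGCC + 5 arbitrary bases + GGCC.
SFI1_SITE = re.compile(r"GGCC[ACGT]{5}GGCC")

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from position 0; '*' marks a stop codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


site = SFI1_SITE.search(C_PRIMER)
# Gene-specific part: everything after the Sfi-1 site and the 4-base GATC spacer.
gene_part = C_PRIMER[site.end() + 4:]
coding_end = reverse_complement(gene_part)

print("Sfi-1 site:", site.group())
print("coding-strand 3' end:", coding_end)
print("translation:", translate(coding_end))
```

The translated tail reads QRVARR followed by a stop, which is consistent with the Gln-Arg-Val-Ala-Arg-Arg C-terminus shown in the <400> 5 listing.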

Another method for expression of mpts-19 was the Drosophila expression
system. The DNA fragment containing the sequences encoding mpts-19 and flanked by
Not-1 and Sfi-1 sites that had been generated by PCR (see above) was cloned into plasmid
Cmk33. Cmk33 is a plasmid derived from pMK33/pMtHy (Li, Bin et al., Biochem J
(1996) 313, 57-64) so that Not-1 and Sfi-1 were in the cloning site. The overhangs
generated by digestion of this plasmid with Not-1 and Sfi-1 are compatible with the
overhangs generated in the digested DNA containing the mpts-19 fragment (see above).
A plasmid containing the correct sequence of mpts-19 was amplified and purified.
Drosophila (S2) cells were transformed with the plasmid using standard techniques (Li,
Bin et al., Biochem J (1996) 313, 57-64). Culture fluid was collected 2 days after
transfection. These samples were assayed for aggrecanase activity by incubating with
bovine aggrecan (Sigma, St. Louis, MO, USA) at a concentration of 0.1 mg/ml. The
samples were then incubated with both chondroitinase ABC and keratinase (10 u/ml) at
37°C overnight. The samples were then examined by Western blotting using an
antiserum that reacts with a neoepitope generated when aggrecan is cleaved by
aggrecanase.

Example 5

Expression of MPTS-20

An example of a system for expression of mpts-20 is the COS-7 mammalian cell
system. The nucleotide sequence that encodes mpts-20, including the secretion signal
sequence and the C-terminal stop codon, was ligated into a pcDNA3.1 plasmid
(Invitrogen, Carlsbad, CA, USA). Two micrograms of the resulting plasmid was
combined with lipofectamine (Life Technologies, Rockville, MD, USA). The mixture
was then added to COS-7 cells, which were grown in 6-well plates to a density of
approximately 90% confluency. After 6 hours, fresh medium was added to the cells and
after 24 hours the cells were washed and fresh serum-free medium containing bovine
aggrecan (0.1 mg/ml, Sigma, St. Louis, MO, USA) was added. The cells were incubated for an
additional 48 hours. Five hundred microliters of culture fluid from each well was
collected and concentrated tenfold. Two microliters of chondroitinase ABC and
keratinase (10 u/ml, Sigma, St. Louis, MO, USA) were then added and the samples
incubated overnight at 37°C. The samples were then boiled in SDS-PAGE sample
loading buffer, electrophoresed on a polyacrylamide gel and transferred to a PVDF
membrane. A Western blot using an antiserum against a neoepitope generated when
aggrecanase cleaves aggrecan was then performed.
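
The quantities in this assay can be worked through as simple arithmetic. The sketch below is only an illustration: it assumes that no aggrecan is lost during the tenfold concentration step and that the two microliters refer to a single addition of a 10 u/ml stock, neither of which is stated in the text. The helper name is hypothetical.

```python
# Values taken from the text.
AGGRECAN_MG_PER_ML = 0.1     # aggrecan in the serum-free medium
SAMPLE_UL = 500.0            # culture fluid collected per well
CONCENTRATION_FACTOR = 10.0  # "concentrated tenfold"
ENZYME_UL = 2.0              # volume of glycosidase stock added
ENZYME_U_PER_ML = 10.0       # activity of the glycosidase stock


def ul_to_ml(volume_ul: float) -> float:
    """Convert microliters to milliliters."""
    return volume_ul / 1000.0


# Aggrecan present in the collected culture fluid, in micrograms.
aggrecan_ug = AGGRECAN_MG_PER_ML * ul_to_ml(SAMPLE_UL) * 1000.0

# Sample volume after concentration, in microliters.
final_volume_ul = SAMPLE_UL / CONCENTRATION_FACTOR

# Assumption: all aggrecan is retained during concentration.
final_aggrecan_mg_per_ml = AGGRECAN_MG_PER_ML * CONCENTRATION_FACTOR

# Enzyme activity delivered by the 2 microliter addition, in units.
enzyme_units = ENZYME_U_PER_ML * ul_to_ml(ENZYME_UL)

print(f"aggrecan per sample: {aggrecan_ug:.0f} ug")
print(f"volume after concentration: {final_volume_ul:.0f} ul")
print(f"aggrecan after concentration: {final_aggrecan_mg_per_ml:.1f} mg/ml")
print(f"enzyme added: {enzyme_units:.2f} u")
```

Under these assumptions each sample carries about 50 micrograms of aggrecan in roughly 50 microliters, digested with about 0.02 units of enzyme.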

Another example of a system for expression of mpts-20 was the baculovirus
expression system. The DNA sequence that contained the coding sequence for mpts-20
(including the sequences that code for the secretion signal sequence) and that had been
cloned in the pcDNA3.1 vector was modified by PCR so that the coding sequence and
the translational stop codon were flanked by Not-1 (N-terminal side) and Sfi-1
(C-terminal side) sites. The primer used for the N-terminal end was
GATC GCGGCCGC TGCGCTGTGATGAGTGTGCCTG and the primer for the
C-terminal end was

TGA GGCCTTCAGGGCC GATCTTATAAAGG( X.TTGAGA A A AGAG After 
amplification using standard PCR methods, the fragment was digested with Not-1 and
Sfi-1. The digested fragment was ligated into a vector, pVL1392-U, which had also been
digested with Not-1 and Sfi-1. pVL1392-U is a derivative of the baculovirus transfer
plasmid pVL1392 (PharMingen, San Diego, CA, USA) in which the multiple cloning site
has been modified to contain Not-1 and Sfi-1 sites. The overhangs generated by digestion
with Not-1 and Sfi-1 were complementary to the overhangs generated in the Not-1 and
Sfi-1 digested PCR-amplified DNA. The ligated DNA was transformed into bacterial
cells and a clone was selected that contained the plasmid and the correct mpts-20
sequence. This plasmid was produced and purified. The mpts-20 sequence was
transferred into a baculovirus vector using standard techniques (Baculovirus Expression
Vectors: A Laboratory Manual [...]
of the plaque-purified virus preparations at a multiplicity of 3.
Culture fluid was harvested 3 days after infection. These samples were assayed for
aggrecanase activity by incubating with bovine aggrecan (Sigma, St. Louis, MO, USA) at
a concentration of 0.1 mg/ml. The samples were then incubated with both
chondroitinase ABC and keratinase (10 u/ml) at 37°C overnight. The samples were then
examined by Western blotting using an antiserum that reacts with a neoepitope
generated when aggrecan is cleaved by aggrecanase.
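
To give a sense of what infection at a multiplicity of 3 implies, the sketch below reads "multiplicity" as multiplicity of infection (infectious units per cell) and applies the usual Poisson approximation for virus uptake. The cell number and virus titer are hypothetical placeholders, since neither is given in the text.

```python
import math

MOI = 3.0  # multiplicity stated in the text


def fraction_infected(moi: float) -> float:
    """Poisson estimate of the fraction of cells receiving at least one virus."""
    return 1.0 - math.exp(-moi)


def inoculum_ml(moi: float, n_cells: float, titer_pfu_per_ml: float) -> float:
    """Volume of virus stock needed to reach the requested multiplicity."""
    return moi * n_cells / titer_pfu_per_ml


# Hypothetical culture: 1e8 Sf9 cells and a stock titer of 1e8 pfu/ml.
HYPOTHETICAL_CELLS = 1e8
HYPOTHETICAL_TITER = 1e8

print(f"fraction of cells infected: {fraction_infected(MOI):.3f}")
print(f"virus stock required: "
      f"{inoculum_ml(MOI, HYPOTHETICAL_CELLS, HYPOTHETICAL_TITER):.1f} ml")
```

Under the Poisson approximation, a multiplicity of 3 leaves roughly 5% of cells uninfected in the first round, which may be one reason to harvest several days after infection.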

Another method for expression of mpts-20 was the Drosophila expression
system. The DNA fragment containing the sequences encoding mpts-20 and flanked by
Not-1 and Sfi-1 sites that had been generated by PCR (see above) was cloned into plasmid
Cmk33. Cmk33 is a plasmid derived from pMK33/pMtHy (Li, Bin et al., Biochem J
(1996) 313, 57-64) so that Not-1 and Sfi-1 were in the cloning site. The overhangs
generated by digestion of this plasmid with Not-1 and Sfi-1 are compatible with the
overhangs generated in the digested DNA containing the mpts-20 fragment (see above).
A plasmid containing the correct sequence of mpts-20 was amplified and purified.
Drosophila (S2) cells were transformed with the plasmid using standard techniques (Li,
Bin et al., Biochem J (1996) 313, 57-64). Culture fluid was collected 2 days after
transfection. These samples were assayed for aggrecanase activity by incubating with
bovine aggrecan (Sigma, St. Louis, MO, USA) at a concentration of 0.1 mg/ml. The
samples were then incubated with both chondroitinase ABC and keratinase (10 u/ml) at
37°C overnight. The samples were then examined by Western blotting using an
antiserum that reacts with a neoepitope generated when aggrecan is cleaved by
aggrecanase.
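
The overhang compatibility relied on in the cloning steps can be illustrated with a short sketch. It assumes the standard cut positions for the two enzymes (GC^GGCCGC for Not-1, leaving a 4-base 5' overhang, and GGCCNNNN^NGGCC for Sfi-1, leaving a 3-base 3' overhang); the text itself states only that the overhangs are compatible. The function names are hypothetical, and only the readable 5' part of the C-terminal primer is used.

```python
import re

NOT1_SITE = "GCGGCCGC"
SFI1_SITE = re.compile(r"GGCC([ACGT]{5})GGCC")

# mpts-20 N-terminal primer as quoted in the text (spaces removed).
N_PRIMER = "GATCGCGGCCGCTGCGCTGTGATGAGTGTGCCTG"
# Readable 5' portion shared by the C-terminal primers (stop, Sfi-1 site, GATC).
C_PRIMER_ARM = "TGAGGCCTTCAGGGCCGATC"


def not1_overhang(seq: str):
    """Return the 5' overhang left by Not-1 (assumed cut GC^GGCCGC), or None."""
    start = seq.find(NOT1_SITE)
    if start < 0:
        return None
    # Top strand is cut after the first two bases; the next four are single-stranded.
    return seq[start + 2:start + 6]


def sfi1_overhang(seq: str):
    """Return the 3' overhang left by Sfi-1 (assumed cut GGCCNNNN^NGGCC), or None."""
    match = SFI1_SITE.search(seq)
    if match is None:
        return None
    spacer = match.group(1)
    # Top strand is cut after the 4th spacer base and the bottom strand after
    # the 1st, so spacer bases 2 to 4 remain unpaired.
    return spacer[1:4]


print("Not-1 overhang:", not1_overhang(N_PRIMER))
print("Sfi-1 overhang:", sfi1_overhang(C_PRIMER_ARM))
```

Because the Sfi-1 overhang is taken from the spacer bases rather than from the fixed part of the site, a vector site carrying the same spacer would be expected to give a matching end, and the two different overhangs fix the orientation of the insert.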

Example 6

Purification of mpts-10, 15, 19 and 20:

Mpts-10, 15, 19, and 20 were purified from the culture fluid of the expression systems
described above using chromatographic procedures. For example, the culture fluid was
adjusted with regard to pH, filtered and then loaded onto a column packed with
sulfopropyl sepharose FF (Amersham-Pharmacia Biotech, Piscataway, NJ, USA). After
washing with a buffer consisting of 10 mM CaCl2, 0.1 M NaCl, and 0.05% Brij 35 at a pH
which results in retention of the mpts's on the column, the mpts's were eluted with a 0.1
M to 1.0 M NaCl gradient. Fractions from the column were assayed for the presence of
aggrecanase activity as described above and pooled. For the purification a column
packed with phenylsepharose or sephacryl S-200 can also be used.
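
As a rough guide to where a given fraction sits on the salt gradient, the sketch below assumes that the 0.1 M to 1.0 M NaCl gradient is linear and is collected in equal-volume fractions. Both points, and the choice of nine fractions, are assumptions made for illustration; the text does not specify the gradient shape or the fraction count.

```python
def linear_gradient(start_m: float, end_m: float, n_fractions: int) -> list:
    """NaCl molarity at the midpoint of each equal-volume fraction."""
    step = (end_m - start_m) / n_fractions
    return [start_m + step * (i + 0.5) for i in range(n_fractions)]


# Hypothetical run: nine fractions across the 0.1 M to 1.0 M gradient.
fractions = linear_gradient(0.1, 1.0, 9)

for number, molarity in enumerate(fractions, start=1):
    print(f"fraction {number}: {molarity:.2f} M NaCl")
```

Matching the active fractions from the aggrecanase assay against such a table gives an approximate salt concentration at which the protein elutes.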
SEQUENCE LISTING

<110> Hoffmann-La Roche AG

<120> Novel Metalloproteases Having Thrombospondin Domains and Nucleic Acid Compositions Encoding the Same

<130> 20594

<150> 60/184,152
<151> 2000-02-18

<160> 10

<170> FastSEQ for Windows Version 4.0

<210> 1
<211> 959
<212> PRT
<213> human

Xaa at 909 is any amino acid
X at 211 is any amino acid
<223> X at 218 is any amino acid
<223> Xaa is any amino acid



- ? "2 1 
<400> 1

Men Glu lie Leu Trp Lys Th: Leu Thr Trp lie Leu Ser Leu lie Met 

1 5 10 15 

Ala Ser Ser Glu Phc Kis Ser Asp His Arg Leu Ser Tyr Ser Ser Gin 
5 20 25 30 

Glu Glu Phe Leu Thr Tyr Leu Glu Mis Tyr Gin Leu Thr He Pro lie 

3 5 4 0 4 5 

Ary Vai Asp Gin Asn Giy Ala Phe Leu Ser Phe Thr Va: Lys Asn Asp 
bQ 5 5 6 0 

10 Lys His Ser Arg Arg Arg Arg Ser Met Asp Pro He Asd Pro Gin Gin 
6 5 7 0 75 8 0 

Ala Val Ser Lys Leu Phe Phe Lys Leu Ser Ala Tyr Giy Lys His Phe 

8 5 90 95 

His Leu Asn Leu Thr Leu Asm Thr Asp Phe Val Ser Lys His Phe Thr 
15 100 105 HQ 

Vai Glu Tyr Trp Giy Lys Asp Giy Pro Gin Trp Lys His Asp Phe Leu 

^ i:!0 125 

Asp Asr. Cys His Tyr Thr Giy Tyr Leu Gin Asp Gin Arg Ser Thr Thr 
13C :35 140 

20 Lys Val Ala Leu Ser Asn Cys Val Giy Leu His Giy Vai lie Ala Thr 
- 1-1 1 ' .> 1 j 5 5 ' f 

Glu Asp Glu Glu Tyr Ph<-.- i lr- Gl": Pro Leu Lys Asr, Thr Th r Glu Ann 

^ IP-r :h- 
Leu Thr Glu Asp Gin Pro Asn Leu Giu lie Asn His His Ala Asp Lys
3C5 310 315 320 

Ser Leu Asp Ser ?he Cys Lys Trp Gin Lys Ser He Leu Ser His Gin 
325 3 30 335 

5 Ser Asp Gly Asn Thr He Pro Glu Asn Gly He Ala His His Asp Asn 
340 345 35C 

.Ma Val Leu He Thr Arg Tyr Asp He Cys Thr Tyr Lys Asn Lys Pro 

3 5 5= 3d 0 3 6 5 

:.:ys Gly Thr Leu Gly Leu Ala Ser Val Ala Gly Ker. Cys Glu Pro Glu 
10 370 375 3S0 

Arg Ser Cys Ser- He Asn Glu Asp He Gly Leu Gly Ser Ala Phe Thr 
SQS 39 0 40C 

He Ala H;.s Glu He Gly His Asn Phe Gly Me;: Asn His Asp Gly He 
4 0 5 4 10 4 15 

15 :;ly Asn Ser Cys Gly Thr Lys Gly His Glu Ala Ala Lys Leu Met Ala 
420 42 3 430 

Ala His He Thr Ala Asn Thr Asn Pro Pho Ser Trp Ser Ala Cyf Ser 

4 3 5 14 0 4 4 :> 

A.rg Asp Tyr He Thr Ser Phe l.eu Asp .Ser: Gly Arrq Gly Thr Cys Leu 
20 4 50 4 55 460 

Asp A. sr. G 1 u ? re: ?rc> Lys At t_: Asp Oh-"- L^-u P/r ? r r« AH Va : A L a Pro 
,0 5 4 "0 4" 5 481 

Gly Gin 7 a 1 Tyr Asp A 1 a Asp Glu G . :: 0:ys Arc: One Gin Tyr Gly Ala 

4 3 5 4 0 0 4 9 :, 

25 Thr Ser Art;: Sin Cys lys Ty: 01y Glu "/a i '0-\- Arg Glu Leu Trp Cys 
5 00 H : - 5 1 ■. 

Leu Ser Lys Ser As:: An.; Cys A; : Thr Asn 3er Tie Pre Ala Ala G-u 

515 C; 525 

Sly Thr Leu Cys (Or Tnr Gly ^sn : le Glu lys Gly Trp Cys Tyr Gin 
30 33 0 :o:, 3.0 

Gly Asp Cys 0'al Pre Phe Gly Thr Hp Pro OLa Ser lie Asp Gly Gly 
54 5 3 ^0 5 5 5> 56 0 

Trp Gly Pro Trp Ser Leu Trp Gly Giu Cys Ser Arg Tnr Cys Gly Gly 

5 6 0 5 "; 0 5 7 5 

35 Gly Va 1 Ser Ser Ser Leu Arg His Cys Asp Ser Pro Ala Fro Ser Gly 
580 5 0 -j 500 

Gly Gly Lys Tyr Cys Leu Gly Glu Arg Lys Arg Tyr Arc Ser Cys Asn 

595 60 0 605 

Thr Asp Pro Cys Pro Leu Gly Ser Arg Asp Pie Arg Glu Lys Gin Cys 
40 610 615 620 
Ala
625 
Pro 

5 Glu 



10 



20 



I'h: 
" 8 "j 
AH 



Asp 
Tyr 
Giy 



Thr Gin 



His 
5 90 
Cys 



Lys 

Arg 
7 05 
Gly 



\j Gin rle 



Ser Ly 



25 Glu Gli 



Phe Asp Asn 

Thr Gly Gly 
645 

Tyr Asn Phe 
66C 

Cys Asn Ala 

675 

Val Gly Cys 
Arg Val Cys 



Phe Phe Asn Asp 

7 2 5 

Pro Arg Gly 
74 0 

As- Tyr He 

7 5 5 

ly Ala Trp Thr 
7 0 

Met Phe K: s Tvr 



Gly Pro rhr 

6 0 5 

820 



Mer. Pro 
630 

Giy Val 
Tyr Thr 

Asp Ser 
Asp Asn 

Gly Giy 

7 L0 

Ser Leu 



Phe Arg Gly 



Lys Pro 



Glu 

Leu 
G 30 
I le 



Aia Leu Lys 



He Asp 

Lys Arc; 

7 -J j 

Lei Giu 

He Arg 



Arg 

665 
Asp 



uys 
650 
Ala 

lie 



Leu Gl\ 



Asp Gly Ser 



Pro Arg 



Ser Val His He 



j „y 
Hu 



Ser 



Thr Ast: 



Lys Tyr Tyr Asn 

6 35 

Ala Leu Asn Cys 

Pro Ala Val He 

670 

Cy-; He Asn Gly 

685 

Ser Asp Aia Arg 

7 00 

Thr Cys Asp Ala 

7 1 5 

Gly Tyr Met Glu 

Val Arg Glu Val 

75 0 

Giy Asp Asp Tyr 

' 5 5 

Lys Phe Asp Val 

vyo 

Giu Pro Glu Ser 

■ ' -) 5 

Val Xoi Va. Leu 



Ly- 



Trp Lys 
640 
Leu Ala 
655 

Asp Gly 

Glu Cys 

Glu Asp 

He Giu 
72 0 
Val Val 
735 

Ala Met 
Tyr He 
Ala Gly 

H i.) j 
I.-.. g: :-: 



Cvs Leu Leu Lys Lys L 



3 8 : 

897 
Tyr Leu Glu Gly Gly Leu Phe Ala Phe Arc Glu His lie Leu Gly
945 950 955 



<210> 2
<211> 2379
<212> DNA
<213>



<400> 2 

dt^gaaactt tgtggaagac gttcacctgg a:::tgagcc tcatcaLyyu tceatcggaa 



•SO 

tttca^agty accacaggct ttcstacagt tctraagagg aattcctyc 

:..!0 



" f a t. c t t y a a 



oactaccagc -aactattcc aataagggtt gatcaaaatg gagcatttc: eagctttact 

; \o 

gtga.-i, t^atg ataaaractc aaggagaaga eggagtatgg accctat tea t ecacagcag 
.": :) 

gcag' , r ■ m. a «c:ttat LLtt taaactttca ccc:5tggca agcacr.ruv. tctaaact ty 

■ ) j 

ar. " ~ : .-:a.i:.Ti cayat t r. t. g t giccaaucac 1 1 T aca gtag aatar. tggyy ^aaayatrr.-.ja 



•ftyga r:aea . ya t r; t 1 t tayacaac tgtrartaca caggatar 
1 o 



;r. a ;■ a -a, ; ■ ■'. g : ggc 1 1. : aay ::aac gn : :;;;gt : c:ca : gg : g • ^' ^icLc:, 



uaaga \yu<..rj agl a<: 1 1 f a t. ojaacct L ta aayaat^cca cagayga'..; 



a^r.M-.gaaa «• uyceaecc t.eatgt. ta~t tacaaaaagt ozgcccztc acaacgacat 

ctgta~gatc actctcartg tgygytttcg gatttearaa gaagtggca-.: <■.<: yytyg 
h 0 

c t gaavqaca en tccac t g t ttcttattca ctaccaatta acaacacaca tatceaccac 

ayucayaaga oaccagtgag cattgaaegg tttgtggaga cactggtagt ggcagacaaa 
' 7 !■■ 0 

atgatygtgc gctaceatgg ccgcaaagac attgaacatt acattttgag tgtgatgaat 
3 4 0 

attyttgeca aactttaccg tgattccagc ctaggaaacg ttgtgaatat tatagtcccc 
9i'C 
cgcttaattg ttctcacaga agatcagcca aacttggaga taaaccacca tgcagacaag
96 0 

tccctcgata gcttctgtaa atggcagaaa tcca:cccct cccaccaaag tgatggaaac 
1020 

5 accattccag aaaatgggat tgcccaccac gataatgeag ttcttattac tagatatgat 
1080 

atctgcactc ataaaaataa gccctrjtgga acactgggct tggcctctgt ggctggaatg 
1140 

rgtcagcctg aaaggagctg cagcattaat gaagacat t.g gcctgggttc agcttttacc 
10 1200 

atlgcacatg agateggtea caattttggt atgaaccatg atggaattgg aaatcctrg: 

12 6 0 

gggacgaaag gtcatgaagc agcaaaactt atggcagctc acat:ac:gc gaataccaat 

13 20 

15 cctctttcc: gytctgcttg cacccgagac r.aca^ caeca gcttccta^a tccaggccgc 
13 BO 

ggtacr.r.gcc r.:.gar.aaLga gcctcccaag cgtgaccttc t ttateca-gc t gtggcccca 
1 f. 4 C 

ggtcaggr.gr. atgatgctya rgagcaatgt cgtt tccagt atggagcasc cteccgccaa 
2U 1500 

tgtaaatatc gggaagtoLg ragaqagcrc : ggr.gtc ca cicaaaagc.ja c:gc;g:g;c 

: s 6 o 

accaacacta r.rccaucarjc tyaggggaca rrr.gr.rir.caaa c tyygaatat tgaaaaaggg 

' : -*~ )' ' : ' ag-zgav'" t:u t r <•■: * r * f gg :arv - • aca t uggggc 

16 8 0 



1 8 GO 

gagaaacagt qtgeagacr. r :. .;avan' atg :cc::ag gaaagrarra taactggaaa 

1 1 j 2 fJ 
gctagggaag atagatgtcg agtctgtgya ggggacggaa gcacatgtga tgccattgaa
2160 

gggctcttca acgattcact gcccagggga ggctacatgg aagtggigca gataccaaga 
2220 



ggcCctgttc acattgaagt tagagaagtt gccatgtcaa agaa::atar. tgcrtttc 
2230 

rctgaaggag atgattacta tattaatggt gcctggacr.a r.t gazrtggcc t aggaaa:ct 
2 3 4 0 

gatytr.gctg ggacagcttt ::attacaag agaccaactg ac gaaccaga acccttggaa 
2 £ 0 0 

gctctaggtc ctacctcaga aaatc:catc gtcatggttc ';g:*::caaga acagaacttg 

2:160 

' : ;gaar.agct a^aagttcaa :gttcccatc act:gaactq gcagtygaga Laatgaagtt 
2520 

; :iqci:-.aca: ggaatcatca gtcttggtca gaatgctcag c:ic::gtgc tgyaggtaag 
2 5 H 0 

.-. r.gcc t.:ac:a gycagcccac ccagagggca agatggagaa ca ^aa-aca: r -tgagctat 
2 S * 1 0 

^"^-q'.gr: h .gt:t:aaaaaa g:caa:igga aaca:t:c:r. gcaggtt r.gc ::caagctg: 

2 " i :■ 0 

•.-.attt^ccaa aagaaacr. t : gc-ii Laatta :a::aca^:: a r. 1 1 g 1 1 1 : c:aric;c:caccj 

2 " 7 • : 0 

'■aa:::".cr,gc aya t ; t g : g ycaaaat.aca :c:tggcaca a r. gag tg t c 
2 8 . C 



_yetggryc 



:aag a;:: ra:.ci: :ja aggtgcryctg t::gcc:::.';c gr.g ~a --a --a ■ c t ': g g tat 



IS 0 



<210> 3
<211>
<212> PRT
<213> human

<400> 3

Me:. Ala Pre Ala Cys Glr. lie Leu Arg Trp Ala Leu Ala Lei: Gly Leu 

15 10 15 

'i-y 2eu Met Phe Glu Val Thr His Ala ?he Arg Ser Gin Asp Glu Phe 

20 25 30 

2eu Ser Ser Leu Glu Ser Tyr Glu lie Ala Phe Pro Thr Arg Val Asp 
3 5 4 0 4 5 
His Asn Gly Ala Leu Leu Ala Phe Ser Pro Pro Pro Pro Arg Arg Gin

50 55 60 

Arg Arg Gly Thr Gly Ala Thr Ala Glu Ser Arg Leu Phe Tyr Lys Val 
65 70 75 80 

5 Ala Ser Pro Ser Thr His Phe Leu Leu Asn Leu Thr Arg Ser Ser Arg 
85 9C 95 

Leu Leu Ala Gly His Val Ser Val Glu Tyr Trp Thr Arg Glu Gly Leu 

ICO 105 110 

Ala Trp Gin Arg Ala Ala Arg Pro His Cys Leu Tyr Ala Gly His Leu 
10 LIS 120 125 

Gin Gly Gin Ala Ser Ser Ser His Val Ala He Ser Thr Cys Gly Gly 

13 0 13 5 140 

Leu His Gly Leu He Val Ala Asp Glu Glu Glu Tyr Leu He Glu Pro 
145 15 0 155 160 

15 Leu His Gly Gly Pro Lys Gly Ser Arg Ser Pro Glu Glu Ser Gly Pro 

16 5 17 0 175 

His Val Val Tyr Lys Arg Ser Ser Leu Arg His Pro His Leu Asp Thr 

18 0 18 5 19 0 

Ala Cys Gly Val Arg Asp Glu Lys Pro Trp Lys Gly Arc Pro Trp Trp 
20 195 2 00 2 05 

Lou Arg Thr Leu Lys Pro Pro Pro Ala Arg Pro Leu Gly A sr. Glu Thr 

210 ;> : 5 2:0 

Glu Arg G~y G L r. Pro Go/ Leu Lys Arc Ser Val Ser Arg Glu Arq Tyr 
2 2 5 2 0/ 235 2 4 0 

23 Val G.i; Thr Leu Val V-i : A 1 a Asp Lys Met Met Vc. 1 Ala Tyr His Gly 



2 9 0 3 r c 

His His Ala Gly Lys Ser Leu Asp Ser ?hc Cys Lys Trp Gin Lys Ser 

3C5 -H 31b 320 



CA 02332533 2001 -02 16 



54 



Me" Cys Glu Arg Glu Arg Ser Cys Ser Val Asn GIu Asp lie Giy Leu 

370 375 380 

Ala Thr Ala Phe Thr lie Ala His Glu lie Gly His Thr Phe Gly Met 
385 390 J95 400 

A s.i His Asp Gly Val Gly Asn Ser Cys Gly A] a Arg Gly Gin Asp Pro 

4 0 5 410 415 

Ala Lys Leu Met: Ala Ala His lie Thr Met: Lys Thr Asn Pro Phe Val 

420 425 430 

Trp Ser Ser Cys Ser Arg Asp Tyr He Thr Ser Fhe Leu Asp Ser Gly 

4 3 5 4 4 0 4 4 5 

Leu Giy Leu Cys Leu Asn Asn Arg Pro Pro Arg Gin Asp Phe Val Tyr 

-50 455 450 

Pro Tar Val Ala Pro Gly Gin Ala Tyr Asp Ala Asp Glu Gin Cys Arg 
4 6 5 4 ' 0 4 7 5 433 

Phe G In His Gly Val Lys Ser Arg Gly Leu Gin Arg Ala Val Val Ser 

485 4 9 0 4 95 

Glu Gin Glu Gin Pro Val His Hiu Gin Gin His Pro Gly Arg Arg Gly 

^ C' 1' 5 0 5 510 

His Ala Val Pro Asp Ala His His Arg Gin Gly Val Val Leu Gin Thr 
-I" 5 2 0 o2 5 

■j 3 0 : : 3 5 .3 4 '1 

Gly Ala Val A<rp Ser Met Gly Asp Cys Ser Arg Thr Cys Gly Gly Gly 

54 5 550 :V ; ^50 

Val Ser Ser Ser Ser Arg His Cys Asp Se: Pro Arg Pro Thr Vie Gly 

Asp Asp Cys Pro Pro Gly Scr C '. r. Asp :-':>.- Crg Glu -,'al Gin Cys Ser 

1>?2 5J5 

Glu Phe Asp Ser lie Pre Phe A i\j Giy lys Phe Tyr Lys Trp Lys Thr 

610 615 622 

Tyr Aru Gly Gly Gly Val Lys Ala Cys Ser : eu Thr Cys Leu Ala Glu 

625 6 3 0 6^5 54 0 

Gly Phe Asn Phe Tyr Thr Glu Arg Ala Ala Ala Val Val Asp Gly Thr 

64 5 5 5 0 555 

Pro Cys Arg Pro Asp Thr Val Asp Ire Cys Val Ser Gly Glu Cys Lys 

6 6 0 555 670 

His Val Gly Cys Asp Arg Val Leu Gly Ser Asp Leu Arg Glu Asp Lys 
575 630 635 
Cys Arc Val Cys Gly Gly Asp Gly Ser Ala Cys Glu Thr He Glu Gly

69G 695 7 00 

Val ?he Ser Pro Ala Ser Fro Gly Ala Gly Tyr Glu Asp Val Val Trp 
705 710 715 720 

5 lie Pro Lys Gly Ser Val His He Phe He GIr. Asp Leu Asn Leu Ser 
12b 730 T15 

Leu Ser His Leu Ala Leu Lys Gly Asp Gin Glu Ser Leu Leu Leu Glu 

7 -10 74 5 7 50 

Gly Leu Pro Gly Thr Pro Gin Pre. His Arg Leu Pro Leu Ala Gly Thr 

10 7 5 5 7 6 0 7 65 

Thr Phe Gin Leu Arg Gin Gly Pro Asp Gin Val Gin Ser Leu Glu Ala 

7 7 0 7 "/ 5 7 8 0 

Leu G'.y Pro ;le Asm Ala Ser Leu lie Val Met: Val Leu Ala Arg Thr 
'■'8 5 7 90 7 95 80 0 

15 Glu Leu Pro Ala Leu Arg Tyr Arc Phe Asn Ala Pro He Ala Arg Asp 
8 3 5 810 315 

Ser Leu Pro Pro Tyr Ser Trp His, Tyr Ala Pro Trp Thr Lys Cys Ser 

820 82 5 85 0 

Pro Ser Ve 1 Gin Ala Val Ala Arg Cys Arg Arg Trp Ser Ala Ala Thr 

2D 3 3 5 8^0 fU5 

Ly:j Leu Ar p Ser Ser A ; a Va Ala Pro His Tyr Lys Ser A '. <•-. H i s Ser 
r -L . 8 5 5 8 6 0 
<400> 4

ttccatccta atacgactca ctatagggct cgagcggccg cccgggcagg tgtggacacg 
6 0 

t.ggce*: ;tat ggctcccgcc tgccagatcc tccgctgggc cctcgccctg gggctgggcc 

i :i : 

cciryv:cga ggtcaegcat gccttccggt ;tcaaga:ga gttcctgtc:: agtctggaga 
\ H J 

gctatg.^a; cgccttcccc acccgcgtgg acca;aacgg ggcar:gc'g gccttctcgc 

:>-; j 

ca:cc:i::cc ccggaggcag cgccgeygca cgggggccac agccgagr.ee cgcctctcct 

3 J J 

acaaag:ggc ctcccccagc acccacttcc :gc:gaa:cr. gaccog:agc icccgcccac 
3-i ) 

tggca-jggca cgr.c teegtg gagtaetgga caegggaggg cc tggcctgg cagagggegg 
^; . ! ) 

cecyg ; rcca ctgcctcr.de gcnggtcacc -yeagggeca ggccagcacc :cc:a:c^jy 

4 > 1 

ccatc-t.7cac ctgtggaggc ctgcacggcc v.gaccgt ggc acargaggaa gag::accr_ga 

ttgacr : 'c:: gcacggcggc cccaacgg *: - r eggag .:cc ggaggaaag- ggac carat g 
t: ;.. t 

Lgg-. g-,i::aa gcgttcc- ctgcgr race ■c'ra-'r' ::ga cacagcctgr. cgagtgagac 
6 • ■ 

atgag t.tacc gMjgaaaggg cggccc teg-, ggctucy ;ar ct tgaageca ccccctccca 
ggece •\ggcj g^tgaarira gugegrggee agecagg ::c t gaag<:uatcg occagccgag 
agege :..,.cgt ggagacect g gtggtcgcig acaagar. ?at ggtggcctat cacgggcgcc 

gggat egga gcag^-.tgtc ctggccatca tgaacat.tgt tgecaaaett ttccaggact 

1 1 

cgagt<--rggg aagcacegt l aacaicctcg r;aa o-_cc ret catcetgete aeggaggace 
9 m ■ 

agcccactct ggagatcacc caccatgccg ggaagtccct ggacagcttc tgtaagtggc 

io:.-c 

agaaatccat cgtgaaccac ageggecatg geaatgecat tccagagaa:: ggtgtggcta 

I C fc 

accatgacac ageagtgetc atcacacgct atgacatctc catctacaag aacaaaccct 

I I 4 C" 
gcggcacact aggcctggcc ccggtgggcg gaatgcgcga gcgcgagaga agctgcagcg
12 00 

tcaatgagga cattggcctg gccacagcgt tcaccattgc ccacgagatc gggcacacat 
1260 

5 tcggcatgaa ccatgacggc gtgggaaaca gctgtggggc ccgcggtcag gacccagcca 
1320 

agctcatggc tgcccacatt accatgaaga ccaacccatt cgtgtggtca tc^tgcagcc 
13^0 

ctgac:.acac caccagctr. l cragact,ccjg gcccggggct ccgcc:gaac aaccggcccc 
10 1440 

cca:jacagga cctt.gt.gtac ccgacsgcgg caccgggcca agcc:acgat gcagatgagc 
15 0 0 

aatgccgcL" tcagcatgga gtcaaatcgc gaggtctgca gcgagctgtg gtgtctgagc 
L560 

1 ;> aagagcaacc ggcgcaccac caacagcatc ccggccgccg agggcacgct gtgccagacc 
L62 0 

cacacca:cc acaagggg-g atgctaca<ia eg gg i c t g ig t'jccctCtgg gtcccgccca 
168 0 

yagggugegg aeggagce.g ggggccgtigg a::t. ccav.ijqe: ggactycagc cggacctctg 
20 :7-10 

gcggcgycgt gr.cctc::.ct agccgr.cacr. gcgaoaqcx-c "aggccaacc at.cggggcca 
18 0 0 

agt.icvgt. ;;i ggg'-yagaga agy egg care qrt-ccr.r/iwa --a ::ggatgae cgcccccctg 
1 S 6 0 

2r» gcv cc:jag:;a c: t I c ay ag a a g t gcay tg:. *. - : . i : • • : ; r,u:- v ; *: ccj I. : t . rcg'j.gcr ::a 
: 9 2 C 



L C 
cccrtccccca ccgtctgcct ctagctggga ccaccLtcca ajtgcgacag gggccagacc

4oc 

agg:.'jcagag :cccgaagcc ctgggaccga ttaatgcacc tztcatcgtc atggtgctgg 
::4bG 

5 occgjaccga jccgcctgzc ctccgccacc gcntcaacgc c:ccatcgcc cgr_gacr_cgc 
;: 5 . : C 

:gcc: :ccca :tcc:ggcac catgcgcccc ggaccaag-g ctcgcccagt gtgcaggcgg 

:: ; -! C 

-age: : -.ggc :j :aggcggcgg agtgccgcaa ccaagctgga cagclccgcg gtcgcccccc 

10 : .\ ;c 

acca :: Lgca :; igcccacagc aagcttgccc aaaagcaagc gcgccrgcaa ca:ggagcc: 

;: ?--o 

f:gcc: ..■:aag a ctgggt cgr_a ggaact:g:cg ctctgcagcc gcago c tgcg augcaaggcg 
15 t;gc:.'i:' .gcc:; ctcggtcgtg Lgccaagccc cgcgtc cctg ccgcgaagaa aaggcgctgg 

: s.o 

acga-: .gcg r ar.gcccgcag ^rrgcgcccac c:ntec:cag gcctgccacc gccccact'ju 

^ -4 : 0 

ccc: -.ggag tggccgcL:ct cgacLgytc: gagcgcaccc ccagcLgcgg gccgggccrc 

20 : KO 

cqcca cgcg '-.uq^c: !:!.g caagaycucd gaccaccgcg ccaccc:qr; c ccggcgca :.' 
0 

zgczc icccg ccgccaagcc accgyccacc aigcgc-gca aeccgcgcrrcr rtacrcccccg 

ju- i; 

2? gcccg tggg -rggc* ggrga gr.ggggtgrig igcLc:gc = : agr.gcggcgr ccrggcagcgc 
3 1.;? 

cagcg- :;cgg -qcgctxra;: cacjeeaeacg ggccaggcgt cgcacgagt :: cacggaggcc 
; : - :) 

c eg eg-; cage cgact.anr.aa gccL.cgt.cga cccgggaact aa:tccgga,: cggtacctge 
30 :j,o 

aggcgcacc.i gc:r.t.cccta tag 

j2- 3 



<210> 5
<211> 1690
<212> PRT
<213> human
<400> 5

Pro Val Pro Ala Met Pro Gly Gly Pro Ser Pro Arg Ser Pro Ala 

15 10 15 

Leu Leu Arg Pro Leu Leu Leu Leu Leu Cys Ala Leu Ala Pro Gly 
5 20 25 30 

Fro Gly Pro Ala Pro Gly Arg Ala Thr 3Lu Gly Arg Ala Ala Leu 

3b 4 0 4 5 

lie Val His Pro Val Arg Val Asp Ala 31 y Gly Ser Phe Leu Ser 
50 55 60 

10 Glu Leu Trp Pro Arg * • * r pm Arg Lys Arg Asp Val C~r 
6 5 7 0 7 5 

Asp Ala Pro Ala Phe Tyr Glu Leu Gin Tyr Arg Gly Arg Glu Leu 

85 50 95 

Phe Asn Lei; Thr Ala Asn Gin His Leu Leu Ala Pro Gly Phe Val 
15 100 105 110 

Glu Thr Arg Arg Arc Gly Gly Leu Gly Arg Ala His He Arg Ala 



Pro 



Ala 



As D 



Ser 



11'. 



: 2 0 



Thr Pro Aid Cys His Leu Leu Gly Glu Val t'Jl r: Asp Pro Glu Leu 
1 'it: 13 5 1 4 0 

Jl> Gly Gly Le.j Ala Ala lie Ser Ala Cys Asp Gly Leo Lys Gly Val 



Glu 
Glu Glu Asp Leu Lys lie Thr His His Ala Asp Asn Thr Pro Lys Ser
303 310 315 3 20 

Phe Cys Lys Trp Gin Lys Ser lie Asn Met Lys Giy Asp Ala His Pro 
325 330 335 

5 Leu His Hi a Asp Thr Ala lie Leu Leu Thr Arg Lys Asp Leu Cys Ala 
340 3 45 350 

Thr Her Asr. Arg Pro Cys Glu Thr Leu Gly Leu Ser His Val Ala Gly 

3 5 5 3 6 0 3 6 b 

Met Cys Gin Pro His Arg Ser Cys Ser lie Asn Glu Asp Thr Gly Leu 
10 370 37 5 :,80 

Pro Leu Ala Phe Thr Val A La Hi 6 Glu Leu Gly His Ser Phe Gly He 
583 390 395 402 

Gin His Asp Giy Ser Giy Asn Asp Cys Glu Pro Val Gly Lys Arg Pro 
4 0 5 410 415 

15 Phe He Her Ser Pro Gin Leu Leu Tyr Asp Ala Ala Pro Leu Thr Trp 
4 2 0 4 2 5 4 3 0 

Ser Arc Cys Ser Arg Gin Tyr lie Thr Arg Phe Leu Asp Arg Gly Trp 

■13 5 .141..- 445 

Gly Leu Cys Leu Asp Asp Pro Pro Ala Lys Asp .lie He Asp Phe Pre 
-0 4 5 0 460 

46 5 iV-i .;7 5 4 :: ■.' 

G . r. Tyr G-y A .. a I'yr =er A. a Phe Cys Glu Asp Pel Asp Asn Val Cys 

4 r3 5 4 0C 4 05 
Thi 

Ala Ala \'a 1 Asp Gly Thr Arg Cys Gly Glu Asn Lys Trp Cys Leu 

Gly Glu Cys Vc.i Pre Cal G.y Phe Arg Pro GGu Ala Va 1 Asp Giy Giy 

30 53 0 51, 54 0 

Trp Ser G .. y Trp Ser" .3 ; a Trp Ser lie Cys Ser Arg Ser Cvs Glv Ke': 

- 4 ~ : 5 5 0 5 3 5 0 0 

Giy Val Gin Ser Ala Glu Arg Gin Cys Thr Gin Pro Thr Pro Lys 'lyr 

5 5 5 5 ' f 0 5 7 5 

35 Lys Gly Arg Tyr Cys Val Gly Glu Arg Lys Arg Phe Arg Leu Cys Asn 

58 0 5 35 5 00 

Leu Gin Ala Cys Pro Ala Giy Arg Pro Ser Phe Arg His Val Gin Cys 

5 0 3 b 0 0 6 0 5 

Ser His Phe Asp Ala Met Leu Tyr Lys Gly Arg Leu His Thr Trp Val 

40 610 615 620 
Pro Val Val Asn Asp Val Asn Pro Cys Glu Leu His Cys Arg Pro Ala

6 2 5 '3 3 0 63 5 64 0 

Asn Glu Tyr Phe Ala Glu Lys Leu Arg Asp Ala Val Val Asp Gly Thr 

645 650 655 

5 Pro Cys Tyr Gin Val Arg Ala Ser Arg Asp Leu Cys He Asn Gly He 

660 665 670 

Cys Lys Asn Val Gly Cys Asp Phe Glu He Asp Ser Gly Ala Met Glu 

675 680 635 

Asp Arg Cys Gly Val Cys His Gly Asn Gly Ser Thr Cys His Thr Val 

10 690 695 7 00 

Ser Gly Tar Phe Glu Glu Ala Glu Gly Leu Gly Tyr Val Asp Val Gly 

70 5 7 10 7 15 720 

Leu He Pro Ala Gly Ala Arg Glu He Arg lie Gin Glu Val AJa Glu 

7 2 5 7 3 0 7 3 5 

15 Ala Ala Asn Phe Leu Ala Leu Arg Ser Glu Asp Pro Glu Lys Tyr Phe 

740 7 4^ 750 

Leu Asn Gly Gly Trp I'hr He Gin Trp Asn Gly Asp Tyr Gin Val Ala 

7 5 5 7 6 0 ■' 6 5 

Gly Thr Thr Phe Thr 7yr A 1 a Arg Arg Gly Asn Trp Glu Asn Leu Thr 

20 "7 0 775 780 

Ser Pro Gly Pre: Thr Lys Glu Pro Va : Trp Te Gl:: j Leu Phe Gin 

Glu Ser Asn Pn: Gly ..'-:! H : s Ty: J : I'y: '1 h r lie -lis Arg Glu Ala 

805 810 BIS 

R ?. 0 e 3 3 8 3 '3 
Pre Cys Pro Ala Thr Trp Ala Vol Gly Asa Trp Sen Gin Cys Ser Val
945 950 955 960 

Thr Cys Gly Glu Gly Thr Gin Arg Arg Asn Val Leu Cys Thr Asn Asp 

96d 970 975 

S Thr Gly Val Pro Cys Asp Glu Ala Gin Gin Fro Ala Ser Glu Val Thr 
93 0 985 9 90 

Cys :,er Leu Pro Leu Cys Arg Trp Pro Leu Gly Thr Leu Gly Pro Glu 

9 91 10 00 10 0 5 

Gly Ser Gly Ser Gly Ser Ser Ser His Glu Leu Phe Asn Glu Ala Asp 
10 1010 1015 10 2 C 

Phe lie Pre- His Mis Leu Ala Pro Arg Pre Ser Pro A '. a Ser Ser Pro 
1025 1030 103 5 10-1 0 

Lys Pro Gly Thr Me: Gly Asn Ala Tie Glu Glu Glu Ala Pro Glu Leu 
104 5 105 0 10 55 

15 Asp Leu Pro Gly Pro Val Phe Val Asp Asp Phe Tyr Tyr Asp Tyr Asn 
10 6 0 10 6 5 107 0 

Phe lift Asr Pne Hi 5 Glu Asp Leu Ser Tyr Gly Pro Ser Glu Glu Pro 

1 C ", 5 10 3 7 10 3 5 

Asr Leu Asp Leu A- a Gly Thr Gly Asp Arg Thr Pro Pro Pro His Sor 
-0 ..090 109- 1IC0 

Pro /- 1 ct Ala Lys Glu Glu Gly Val Leu G.y Pro Tip Te: Pro Ser ?:: 
115 5 1 5 J . 1 < 5 

-5 Trp 7:o Sou Glu Ala Gly Arg Ser Pre P:-o Pro Pro S--r Glu Glu Thr 

:> r;; Gly A.sr Pro Lei: lie Asn Plu: Leu Pro Glu Gl... A;:p Th: Pro lie 

Gly Ala Pri Asp Leu Gly Leu Pro Ser Leu Ser- Trp Pro Are Val Sor 
30 11"C lil'o 1187 

Thr Asp G \ ■■, Gin Thr Pre Ala Thr Pro Glu Ser Glu Asn Asp Phe 

1 1 5 1 190 1 1 95 1100 

Pro Val Gl\ Lys Asp Ser Gin Ser Gin Leu Pro Pre Pro Trp Arg Asp 
1205 1 2.10 1215 

3o Arg Thr Asn Glu Val Phe Lys Asp Asp Glu Glu Pre Lys Gly Arg Gly 
1220 1225 1210 

Ala Pro His Leu Pro Pro Arg Pro Ser Ser Thr Leu Pi o Pro Leu Ser 

1251, 22-10 1145 

Pro Val Gly Ser Thr His Ser Ser Pro Ser Pro Asp Val Ala Glu Leu 
40 12 50 22 5 5 12 6 0 
Trp Thr Gly Gly Thr Val Ala Trp Glu Pro Ala Leu Glu Gly Gly Leu
1265 1270 1275 1280 

Gly Pro Val Asp Ser Glu Leu Trp Pro Thr Val Gly Val Ala Ser Leu 
1285 1290 1295 

5 Leu Pro Pro Pro lie Ala Pro Leu Pro Glu Mec Lys Val Arg Asp Ser 
1300 1305 1310 

Ser Leu Glu Pro Gly Thr Pro Ser Phe Pro Ala Pro Gly Pro Gly Ser 

1315 132C 1325 

Trp Asp Leu Gin Thr Val Ala Val Trp Gly Thr Phe Leu Pro Thr Thr 
10 13 3 0 1 3 3 [ > ' 3 /. n 

Leu Thr Gly Leu Gly His Met: Pro Glu Pro Ala Leu Asn Pro Gly Pro 

13 45 1350 1355 1360 
Lys Gly Gin Pro Glu Ser Leu Ser Pro Glu Val Pro Leu Ser Ser Arg 

13 6 5 1370 13 7b 

15 Leu Leu Ser Thr Pro Ala Trp Asp Ser Pro Ala Asn Ser His Arg Val 
1 3 8 0 1 3 8 5 1 3 9 C 

Pro Glu Thr Gin Pro Leu Ala Pro Ser Lei: Ala Glu Ala Gly Pro Pro 

1 9 5 1 4 (J 0 14 0 5 

a:,i Asp Pro Leu Val Val Arg Asn Ala Ser Trp Gin Ala Gly Asn Trp 
-0 14 10 1415 142 0 

So r Glu 1:73 Ser Tin Thr Cy:-i Gly Leu Gly Ala 7a 1 Trp Arc Pro Val 

14 2 5 143 0 1 4 3 c ' 1440 
Arg Cys Ser Ser Gly Arg Ago Glu Asp Cys Ala Pro Ala Gly Arg Pro 

1 4 4 5 1 4 r : .455 

'-4 50 :,\( : - 14V. 



P: 



Pr- Cys Leu Ser Trp Tyr Thr S« 

1 5 2 5 1 5 5" 



Thr 
Cys Ser Ala Pro Cys Gly Gly Gly Val Gin Arg Arg Leu Val Lys Cys
1 r B 5 1590 1595 16 0 0 

'-'ai Asn Thr Gin Thr Gly Leu Pre Glu Glu Asp Ser Asp Glr. Cys Gly 
16 0 5 1610 1615 

5 His Glu Ala Trp Pro Glu Ser Ser Arg Pro Cys Gly Thr Glu Asp Cys 
162 0 16 2 5 163C 

G]u Pro Val Glu Pro Pro Arg Cys Glu Arg Asp Arg Leu Ser Phe Gly 

1635 1640 1645 

:-':.e Cys Glu Thr Leu Arg Leu Leu Gly Arg Cys Gin Leu Pro Thr Tie 
10 1650 1655 1560 

Arg T'lr Glr. Cys Cys Arg Ser _'ys Ser Pro Pro Ser His Gly Ala Pro 
1665 1570 1675 1680 

.U-r Arg Gly His Gin Arg Val Ala Arg Arg 
16 8 5 1 6 9 0 

15 

<210> 6
<211> 3333
<212> DNA
<213> human

<400> 6

t-Lcjyt * ccty ccaigcccgcj L.-ggccwag'; ccccgcagc: ccgcgcc::: gcr.gcccccc 

■ i 0 

:-t rctiictgc Lccrc:gcc:c '.ctggciccc ggcgcccccg gacccgcacc aggacctgea 

25 

< ;: -cagggov gggogg-a j 3 ?.ca {j cecccggizc gagtcgaege ggggggcr.ee 

• :cwt<f::cct aogagcgte gccccgeyea ctgcgcaagc gggatgeate g t gegcega 
2 -t 0 

t.-ecgc/cccg ccrr.cr.f.cir) act azaazn-r egcgggeecg agc-cgcgc-r, caaccieacc 
3 ..: 0 

accaa\:cage acctr.gcr.cgc gcccggc: r gtgagegaga cgcggcggcg cggcggcc:g 

3>;t; 

ggecgegege aeatceggge ecacaccccg gcctgccacc tgcttggcga ggegcaggae 
35 420 

( c:gayc:Lcg agggtggcct ggucgeca tc agcgcccgcg aeggcetgaa aygcytg::c 
43 0 

cagctctcca acgaggacta ctr. cattgag cccctggaca gtgccccggc ccggcccggc 
54 0 






cacgcccagc cccacgtggt gtacaagcgt caggccccgg agaggctggc acagcggggt 
60 0 

gattccagtg ctccaagcac ctgtggagtg caagtgtacc cagagccgga gcctcgacgg 
660 

5 gagcgttggg agcagcggca gcagtggcgg cggccacggc tgaggcgtct: acaccagcgg 
720 

ccggtcagca aagagaagtg ggtggagacc ctggtggtag ctgatgccaa aatggtggag
780

taccacggac agccgcaggt tgagagctat gtgctgacca tcatgaacat ggtggctggc
840

ctgtttcatg accccagcat tgggaacccc atccacatca ccattgtgcg cctggtcctg
900

ctggaagatg aggaggagga cctaaagatc acgcaccatg cagacaacac cccgaagagc
960

ttctgcaagt ggcagaaaag catcaacatg aagggggatg cccatcccct gcaccatgac
1020

actgccatcc tgctcaccag aaaggacctg tgtccaacca tgaaccggcc ctgtgagacc
1080

ccgggaccg" cccatgtggc yggcatgiigc cagccgcacc gcagctgcag catcaacgag 

gacacggccc :c:cgctggc c"ra : t. gt d gcccacgnyc ? cggacacac 1. 1. 1 1. ggca r f 
1 :i 0 J 

■ ■agcatgacg aaagcygca.i r gar; yuj.-uj '-'-eg \ ■. g j aacgaccttt rat.ca c a- c t. 

"Cc»catjc:.cr * jv.acgac:-;c c gc t ;:rcc : < ■ ygt .■ : rc r g ;: a ccc; ccag r a t.a : ;: 

' y 1: 0 



:agtt.ci:ggj .■■:cac;.::cc c 1 1 c t ■.; Cv.;a; i -f. -. ■_ g ': :j -. :: cca cac a c c T :; 

: = 0 0 

:gc-:ctg:gg g;a^cacctc tractccaag ft^arij'-o'.: :"-'gT rjga-gg cacccggtgt 
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tgcgcggg^g agcgcaagcg cttccgcctc tgeaacctgc aggcctgccc tgctggccgc 
130 0 

ccccccttcc gccacgtcea gtgcagccac cttgargcca tgctctacaa gggccggrtg 
13 60 

5 caca:a:ggg cgrccgcggt caatgacgtg aaccc-tgcg agc-gcactg ccggcccgcg 
1 )2 0 

aatgagtact t tgcegagaa gclgcgggac gccgtiggtcg atggca.ccc ctgciac:ag 

1 : )3<) 

g~ccgagcca gc:gggac::t c:g:at:aac ggcarrLgta agaacg-ggg ctgtgac'tc 
10 2')4) 

caga::tgact ccggtgct at ggagga.-cgc zgzggtgrgt gccacggcaa cggctccicc 

2 L o :) 

tgcc-icjccg tgagcgggac c:: rga ;jgag gccga gggcc ~gggg: ilul ygatgrgggy 

2 :. 6 j 

15 ctga "c:cag cgggcgcacg cgagat:cgc atccaagagg z'zqc rgaggc tgccaac'-*:c 

2 2 2 0 

ctggoactgc gg igcgagga cccggEigaag :ac:tcctca atygtggctg gacca:c:ay 
2 2 8 ) 

tggaacgggg ac:acciggt ggcag^gacc acc::cacat -icgcac;;cag yggcaac:gg 
2f) 2 ;4 i 

gaga.riw./'.ca <:g : ccc ;:ggg tec :a::aag cagcctgtc: ?ga::cjgc: gc:gr.r.c:ac 

gagagcaacc ctygyg ;gca cia-.-g ~ jtac acca:ccaca jggaggracg tggecac jac 

2 " t 6 j 

vv-;gtr:r :.;gc ctj':ccc:igit c_c j :g jc-ar :a:gggccc^ rjgacca sgtg ;:a :agtc acc 

2 >2 j 

'i :: '2i-''J^ g-jg-gcanag gca gcci \ac:gc::gg agoggc aege aggyece jtc 
2 R j 

garg.-iggagir <-.e ;g-_gaccc ccL gggccyg oe:gatgacc aaoagagcaa gtgeagegag 
51) .: .; " 

c:agci:r.gcc ctyceagg-g g r yyc..-.-tgy ::. gag: gg cage cg:g:::;:ag c ccc tgcggg 
2"' ; C i 

cc\ g:gggcc tc:cccgccg ggccg-gctc tgeatcegea gcgtggggct ggatgagcaq 

2^60 

35 agcgccctgg agccacccgc ctgtgaacac cttccccgyc cccctactga aac;cct:gc 

2 ill 2 0 

aaccgccatg laccczgzcc ggccacctgg gctgt.gggga actggtctca gtgctcagtg 
28 80 

aca^gtgggg agggcactca gcgccgaaat gtcc:c:gca ccaatgacac cggtctcccc 
40 2 94 0 
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igtgacyagg cccagcagcc agccagcgaa gtcacctgct ctccgccac: ctgtcggtgg 
3000 

cccccgggca cactgggccc tgaaggctca ggcagcggct cctccagcca cgagctcttc 
3060 

aacgaggctg acttcatccc gcaccacctg gccccacgcc cttcacccgc ctcatcaccc
3120

aagccaggca ccatg-jgcaa cgccactgag gaggaggctc cagagctgga ccjgjcgggg 
3 1 .-; 0 

cccgtgtttg -ggacgact t. ctactacgac cacaatttca tcaattt:ca cg<aggat.cr_g 
10 3240 

tcc^a>:gggc cctctgagga gcccyatcta gacctggcgg ggacagggga ccggacaccc 
3 3 " 0 

ccacc\Ccica gccgtectge :;gcgccc:cc acyygtagcc c:gtgcc:cc cacagagcct 

3 3 0 

1 :> cc:gc.;gcca aggaggaggg ggtact.ggga cct:ggLccc cgagccc:tg gcctagccag 
3 4.-0 

gccgg;cgc: ccccaccccc accctcagag cagarcoccg ggaacccrcr ga:caa::CC 
34>-0 

;'!..(jcc gdgc aagacacccc c^taggggcr ccagat.cr.t.c ggct.cccjag ecr.g^ccigg 



20 



;■■(■■(": {\cj- :y ■_ ccacLga cjct -rccgcaciacM ccrr.cjceacce cigayag :ca adalgat : : ic 

3 5 : ) u 

<ng *: ". ggc'd uggacagc'v; ..'agcrci'tcr. C cixcct; era : ggcggga jag g-.iccaa'.gag 

g":*' 'c:cigr -> :_ g L t Ltjagg^ m :;a ^gg'ir cgeggagrac * rrrgagacre 
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ccagcttggg acagccccgc caacagccac agagtccc-g ajacccagcc gctggctccc 
■4 200 

agcccggctg aagcggggcc ccccgcggac ccgttgjttg tcaggaacgc cagccggcaa 
-126 0 

- gcgygaaacc ggagcgagcg ctcta:cacc tgtggcrtgg gcgcggcctg gaggccggtg 
432 J 

cgc;gtagct ccggccggga cgagga:tgc gcccccgctg gccggcccca g~ctgcccgc 
4 3 S ] 

cgc-.gc:acc tgcggccctg tgccac^tgg cactcaggca ac:.gyag;.= a gtgctcccgc 
10 4440 

agcgc ggcg gaggtt.cctc agtgcgggac gtgcagtgtg tggacacacg ggacctccgg 
■lb 0 0 

ccactgcggc cctt:cattg Lcagcocggg cctgccaagc cg~ct.gcgea ::ggccc:gc 
4 b -l ) 

13 ggggcocagc cctg:ctcag c:ggtacaca tcrncctgga gggagtzgctc cgaggcctgt 

') 

'rjgcgij'qgtg agcagcagcg r.c t an r. gacc tgcccggagc caggc:~tot;g jaggaggcg 

• . • i O ) 

rr.gci j.|.:cca acac:acccg gccc'.:i;:aa:.- accca^cccr gcacgcagtc gjtggngggg 

:o . m> 

-j j.jgcc agLgjtcagc cccctgiggr. gg-ggr.gi.o-: agcgg;;gcc~ q gr caag"gr 

4 ) 

g^ci.a mccc agac agggcc ccccgaggaa gacagtgacc aglgtgg-jca c-7aggcc*:gg 
* t - '.• ' i 1 

.'^ g ; : : ig-.-: cccggccg-g r gg.rr.i r.-r-g.u: ::,jt r gtg-.gc :-.gc:;;a;:c: : :cccgc r r _ 

g£iq- j ;ga-;: grcijEcc:: cggg:i;:_g.: gagacg< : : g: ■ : ::::':ac^g; c:gc:gccag 
4 MO 

ctgc: : uiccA tc:cg.;acc:a gtgc:.gccgc tcgtgcrc: c ::g.::ccagcca cgccgccccc: 

30 : ' 1 1 '.< 

r cccv;. iggcc a^.cagcgggt l:gccc::cccr: tgaccg-gcc aggatgcaca gaccgaccga 
' . )0 

racic^ii-.cag tgcceaccac gggct g :gc;: ggagotcecg ccccctgcgc ortaatggr.g 

ctaacoeoct-. ctcaotaccc agcagcaggc tgggyacctc ctocccctca aaaaaggta: 

: J i:- 

:trt-.:.atcc :aacagtt:g Cgtaa.;-a: *: t atca:ga:ct tacataaa^g a.jcatctacc 
: d C 

attccaaagc aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaa
5338
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<210> 7
<211> 947
<212> PRT
<213> human



<400> 7

Met Gln Phe Val Ser Trp Ala Thr Leu Leu Thr Leu Leu Val Arg Asp

1 5 10 15

Leu Ala Glu Met Gly Ser Pro Asp Ala Ala Ala Ala Val Arg Lys Asp

20 25 30

Arg Leu His Pro Arg Gln Val Lys Leu Leu Glu Thr Leu Ser Glu Tyr

35 40 45

Glu Ile Val Ser Pro Ile Arg Val Asn Ala Leu Gly Glu Pro Phe Pro

50 55 60

Thr Asn Val His Phe Lys Arg Thr Arg Arg Ser Ile As:; Ser Ala Thr
65 70 75 80

Asp Pro Trp Pro Ala Phe Ala Ser Ser Ser Ser Ser Ser Thr Ser Ser
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Leu Asn Ser Gly Leu Ala Thr Glu Ala Phe Ser Ala Tyr Gly Asn Lys

260 265 270

Thr Asp Asn Thr Arg Glu Lys Arg Thr His Arg Arg Thr Lys Arg Phe

275 280 285

Leu Ser Tyr Pro Arg Phe Val Glu Val Leu Val Val Ala Asp Asn Arg
290 295 300

Met Val Ser Tyr His Gly Glu Asn Leu Gln His Tyr Ile Leu Thr Leu
305 310 315 320

Met Ser Ile Val Ala Ser Ile Tyr Lys Asp Pro Ser Ile Gly Asn Leu
325 330 335

Ile Asn Ile Val Ile Val Asn Leu Ile Val Ile His Asn Glu Gln Asp

340 345 350

Gly Pro Ser Ile Ser Phe Asn Ala Gln Thr Thr Leu Lys Asn Phe Cys

355 360 365

Gln Trp Gln His Ser Lys Asn Ser Pro Gly Gly Ile His His Asp Thr
370 375 380

Ala Val Leu Leu Thr Arg Gln Asp Ile Cys Arg Ala His Asp Lys Cys
385 390 395 400

Asp Thr Leu Gly Leu Ala Glu Leu Gly Thr Ile Cys Asp Pro Tyr Arg

405 410 415

Ser Cys Ser Ile Ser Glu Asp Ser Gly Leu Ser Thr Ala Phe Thr Ile

Ala His Glu Leu Gly His Val Phe Asn Met Pro His Asp Asn Asn Asn

435 440 445

Lys Lys Lys Glu Glu Gly Val Lys Ser Pro Gln His Val Met Ala Pro

4 6 5 4 7 1 4 7 5 4 3 0 

Lys Tyr Ile Thr Glu Phe Leu Asp Thr Gly Tyr Gly Glu Cys Leu Leu
485 490 495

Asn Glu Pro Glu Ser Arg Pro Tyr Pro Leu Pro Val Gln Leu Pro Gly

500 505 510

Ile Leu Tyr Asn Val Asn Lys Gln Cys Glu Leu Ile Phe Gly Pro Gly
515 520 525

Ser Gln Val Cys Pro Tyr Met Met Gln Cys Arg Arg Leu Trp Cys Asn
530 535 540

Asn Val Asn Gly Val His Lys Gly Cys Arg Thr Gln His Thr Pro Trp
545 550 555 560

Ala Asp Gly Thr Glu Cys Glu Pro Gly Lys His Cys Lys Tyr Gly Phe
565 570 575
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Cys Val Pro Lys Glu Met Asp Val Pro Val Thr Asp Gly Ser Trp Gly

580 585 590

Ser Trp Ser Pro Phe Gly Thr Cys Ser Arg Thr Cys Gly Gly Gly Ile

595 600 605

Lys Thr Ala Ile Arg Glu Cys Asn Arg Pro Glu Pro Lys Asn Gly Gly

610 615 620

Lys Tyr Cys Val Gly Arg Arg Met Lys Phe Lys Ser Cys Asn Thr Glu

625 630 635 640

Pro Cys Leu Lys Gln Lys Arg Asp Phe Arg Asp Glu Gln Cys Ala His

645 650 655

Phe Asp Gly Lys His Phe Asn Ile Asn Gly Leu Leu Pro Asn Val Arg

660 665 670

Trp Val Pro Gln Tyr Ser Gly Ile Leu Met Lys Asp Arg Cys Lys Leu

675 680 685

Phe Cys Arg Val Ala Gly Asn Thr Ala Tyr Tyr Gln Leu Arg Asp Arg

690 695 700

Val Ile Asp Gly Thr Pro Cys Gly Gln Asp Thr Asn Asp Ile Cys Val
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Glu Arg Lys Arg Lys Leu Val Cys Thr Arg Glu Ser Asp Gln Leu Thr

900 905 910

Val Ser Asp Gln Arg Cys Asp Arg Leu Pro Gln Pro Gly His Ile Thr

915 920 925

Glu Pro Cys Gly Thr Asp Cys Asp Leu Arg Trp Ala Thr Val Phe Ser

930 935 940

Arg Pro Leu
945



<210> 8
<211> 4036
<212> DNA
<213> human

<400> 8

gcccg-.ggtg ctqqaqz:.:.u agt. tgagiag taggaacqcc gr.agcag L a ygacaataia 

!;5ia?".taaa r.taagaa :. :;g i latgtitagg at tignacggt agaa:r.gcM rr.attcatcc; 

50 t sigtgggta a Lr.gagaag- -v. gc:ac.<ja. U:gcg:agc tggg- tiLgy, -r.a-n c-i-acc 

: u; 

'• • - ; -• " ; ""- ; : ' ■ ar...i C 4q.-i: ::.g t i gagagrgagg agaaggc u :. a eg r. r. *.ag r ^a 

2 ■ , C: 

.:cgr.gacj«LL :gg:a:5:;;i uuiagat-ggg ggc:,^g:e r. tgtcacg-ga gaagaagcag 

15 

: cc;gg.it.gLc agaggggr gr ::: vegctaa;: c :: ggaac l cagaagtgaa agggggctat 

'! ' ■■ j 

! crtag;:; art get a tug c;:a t r ar ga t rair.,,?.-. gat: gag:a:rga: tggtagtarv 
-i ; . j 

50 gtta*'yuti cattgtcegg agag:£.rar.r. gttgaagagg a:agc:a::a gaagcat:a: 



gatgaggtt ac tgcgtga ggaaatactt gatggcaget tctg:ggaac gagggtt:at 



: c:t:-.ggr.r. agaactggaa Laaaagctag catgtttatt tetaggecta ctcaggtaaa 
35 5' i0 

aaatcagtgc gagcttagcg ctgtgatgag tgtgcctgca aagatiggtag agtacacgac 

6 N 0 

ggggg f: gggg tgggaagcac catgeauett gtatcctggg ccacactgct aacgctcc ;..g 
7: o 
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gtgcgggacc tggccgagat ggggagccca gacgcugcgg cggccgtgcg caaggacagg 
780

ctgcarccga ggcaagtgaa attattagag accctgagcg aatacgaaat cgtgtctccc 
840

atccgagtga acgctctcgg agaacccttt cccacgaacg tccacttcaa aagaacgcga
900

cgijagrattd actccgccac tgacccc:gg cctgccctcg cctccccctc ccccccctct 

accic--::i:cc<j aggcgcatta ccgcctctct gcctccggcc agcagtttct at t taatctc 
10 1 0 0 

accgcoaatq ccggattt.at. cgcLccacr.g ttcaccg:ca ccctcctcgy gacgcccggg 
1CSC 

grgaa*. caga ccaagr.rtta tcccgaagag gaagcggaac tcaagcactg f :t:.ctacaaa 

l i.;o 

!:> ggeta'.gtca ataccaactc cgagcacacg gccgtcacca gcctctgctc aggaa:gc:g 

12k 0 

gg::ac«i tr.cc cjgtctcacga tggggat in\ z t::a;.cgaac cac:acagr,c r.acggatgaa 

: ?.>'■,:) 

caagaagarg aagacgaaca aaacaaaccc cacaicat t. r auaggcgcag cgccccccag 
20 13:10 

«g<iaa(f«.'«-.-; : -m acagnaag gcatgratiqt qac.-ficct.c.ig aac:acaaaaa i aggca::agt. 

^aa^cti-Lid^:,: o.g-tGdaccag agcaagaaaa t.aggga:::aaa goan a ; ic:ct r: c <.:<.: tig a c 



<::-i.(" : -.i : _' 



• ; • : ".riar.au;;-. r at. "g; gaac *_ z aar. t.g'g^ ;.:;af".aa:.g.i a-'-ig^a' 
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ccctatagaa gctgttc:a: tagtgaagai agtgga'tga gtacagcttc tacgatcgcc 
1980 

catgagctgg gccatgtgtc taaca:ycct catgatgaca acaacaaatg taaagaagaa 

20 n 

5 ggagtt-iaga gtccccagca Cgtcaigccc cceacactga acc-ctacac caaccccrgg 
2 10) 

acgtggicaa agtgtagtcg aaaatatatc accgagtr.tr. -agacactgg ttatggcgag 
21-i J 

tgttr. jctta acgaacctga atccagaccc taccctctgc ctgtccaacn gccaggcar.c 
1 0 2 2 2 0 

c:::a-.:-iacg tgaataaaca atgtgaattg atttttggac caggttctca ggrgtgccca 
22 SO 

cacat 7 :itgc agcgcagacg gctctgg-gc aataacgrca atggagtaca caaaggctgc 
2 3-1-) 

15 ccgac". :agc acacaccctg ggccgatggg acggagcgcg agccLggaaa ccactgcaag 
r at.gg. ■ -. l L L g:.g::cc:aa agaaa r r;ga ! : gcccccgrga cagargga:.-: ctggggaagr 

'.ggag\rcc: rtggaacetg ctccacraaca tgtggagggg gcatcaaaac: agccac ticca 

20 2 =..:■! 

ga^:,g-:,iacf. ga r.-cagaaec aaaaaa- :;q: ggaaaataet. gtgiaggacg ragaatgac a 
2 ■: " 

itcaa r.rc: f;:'aacacgga gccatgtctc aagcagaagc cacacttccg agargaac:.;j 

26-;- 1 

tgtgc* cac: ttgacgggao gcaitiraac 5:caa-gg:c 'igcrjLcccaa ;g:gcgcr:y 

2 7 m 

ycccc- c r;at acagi gga^;: -.c-gatgaag gacrcggrcjca agt Lgctctig cagagtcgcn 
2'/- t ' 

ggcaa :ac:ac: ccractatca gcttcgagac agagtga".ag atggaactcc t:ctggccag 
30 2o.( 

gacac.i;-atg ara tctg:.gr_ ccagggcc;:: tgccggcaeg ctygatigcga tcatgtttta 
23, ;■ 

aacL^.caag rccggagaga taaatgi.ggg gtttgtggtg gcgataat:c ttcatgcaaa 

2 i ■; f: 

35 a::5g:gccag gaacatttaa racag:a::ar catggctaca ataccgtgc: ccgaattcca 

3-: or 

gctggr.gcta ,-caaratcga tgtgeggcag cacagzttct caggggaaac agacgatgac 

3 0 b 0 

aactact.rag crtiarcaag cagtaaaggt gaattcr.tgc taaatggaaa ctttgttgcc 
40 312 0 
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acaatggcca aaagggaaat tcgcattgyg aatgctgtgg tagagtacag tgggtccaag 
3180 

accgccgtag aaagaattaa ctcaacagat cgcattgagc aagaactttt gcttcaggtt 
3240 

ttgtcggtgg gaaagttgta caaccccgat gtacgctatt ctttcaatat tccaactgaa
3300

gataaacctc agcagt t t ta ctcyaacagt catgggccat ggcaagcatg cagcaaacrc 
3 :■> 6 0 

tgccaagggg aacggaaacg aaaacttgtt tgcaccaggg aatctgatca gcttactctt
3420

:c:g = tc:aaa gargcgatcg gctgccccag cccggacaca ttactgaacc crgcggtaca 
3 4SC 

gactctgacc tgaggcyggc cactgttttc tcaaggcc:: tacaaatgaa c i g tgagagt 
3 54 0 

lr> ctcgcaggag grcccagcag gagaagcaaa aggaggyyat gccggtcctL agltcccc;t 
3 5 0 0 

tct:;gtc';:: caytgaaaca agctttaacc aat.t:c:ccat ccc:c:ggaa ctgactar.cc 

3 o 6 C 

aayacataca tgcccagat: t.ct. tgttcac ctaagaaria aaaatayctci a :. agag ra ' y 
10 372C 

gcucttyccd aaaaaaattc agLt.ya tec" cacrr.act t.crc r ggyragcu-j : :ac:c«ttar 

Mf.r.gagr'-a tgtacgt caaaac : *. g ?. Lr.vr;aaat::i- a,,, ; ay .n-i.nag ayggayaa-'v- 

i u 

3-* :>o 
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<220> 

<223> primer
<400> 9

■jatcgcggcc gctatggtgg acacgtggcc tctatggctc 
41 

10 
4 3 

Artificial Sequence 
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Claims

1. An MPTS protein selected from the group consisting of MPTS-15, MPTS-10,
MPTS-19 and MPTS-20, wherein said protein is present in other than its natural
environment.

2. The protein according to claim 1, wherein said protein has an amino acid
sequence substantially identical to the sequence of SEQ ID NO:01, 03, 05 or 07.

3. A nucleic acid present in other than its natural environment, wherein said
nucleic acid has a nucleotide sequence encoding an MPTS protein selected from the
group consisting of MPTS-15, MPTS-10, MPTS-19 and MPTS-20.

4. A nucleic acid according to claim 3, wherein said nucleic acid has a nucleic acid
sequence that is the same as or substantially identical to the nucleotide sequence of SEQ
ID NO:02, 04, 06 or 08.

5. An expression cassette comprising a transcriptional initiation region functional
in an expression host, a nucleotide sequence according to claims 3 or 4 under the
transcriptional regulation of said transcriptional initiation region, and a transcriptional
termination region functional in said expression host.

6. A cell comprising an expression cassette according to claim 5 as part of an
extrachromosomal element or integrated into the genome of a host cell as a result of
introduction of said expression cassette into said host cell.

7. The cellular progeny of the host cell according to claim 6.

8. A monoclonal antibody binding specifically to an MPTS protein according to
claim 1.
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isolating said protein substantially free of other proteins 



10. An MPTS protein as claimed in claim 1 or 2, whenever produced by the process
of claim 9.

11. A method of screening to identify MPTS modulatory agents, said method
comprising:

contacting an MPTS protein according to claim 1 with a substrate in the
presence of a potential modulatory agent; and

determining the effect of said modulatory agent on the activity of said protein.

12. The method according to claim 11, wherein said substrate comprises a glu-ala
bond.

13. The method according to claim 12, wherein said substrate is aggrecan or a
fragment thereof.

14. A method of treating a host suffering from a disease condition associated with
MPTS activity, specifically wherein said disease condition is characterized by the
presence of aggrecan cleavage products, said method comprising:

administering to said host an MPTS modulatory agent, specifically an
antagonist.

15. Use of an MPTS modulatory agent, obtainable or obtained by the method claimed
in claim 11, for the preparation of a medicament for the treatment of a disease condition
associated with MPTS activity, specifically wherein said disease condition is
characterized by the presence of aggrecan cleavage products, like arthritis.
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FIG. 1A 

MPTS-15: (2S7S bp)

ATGGAAATTTTGTGGAAGACGTTGACCTGGATTTTGAGCCTCATCATGGCTTCATCGGAAT 

TTCATAGTGACCACAGGCTTTCATACAGTTCTCAAGAGGAATTCCTGACTTATCTTGAACA
CTA':CAGCTAACTATTCCAATAAGGGTTGATCAAAATGGAG-:aTTTCTCA3CTTTAGTGTG 
AA.A.\aTGATAAACACT-:AAGGAG.AAGACGGAGTATGGA-:CCTATTGATC:ACAGCAGGCAG 
TAr-l'l'AAG^TATTTTTr.AAACTTTCAGCGrATGGC.A-^GGACTTTC'ATC PAAACTTGACTCT 
C A A 3 AC AG ATTTT 3 T 3 F 3C A AACAT ITT AC AG TAG AAT ATTG 3GGG AAAG A PG 3 A 3CCCAG 

U TG 3AAACAT 3A ITT FT PAGACAAC F 3TGA F PA 3 AC AGGATAT ITGCAAGAT : AACGTAGTA 
CAA 3TAAAG PG 3CrrrAAGCAACT3:GTTGGG P'PG XYFGGTG T T ATTGG'P A 3 AG AA 3 AT 3A 
A 3 A 3TATTTTA FCG A A ; 3TTTA.AA 3AATAC'"A " ap, -%(XX *TTC 3 AAGCA'FTT'PAG PTATG A* 
AAT- 3GCC AC 3 C TC A PG TTAFTTAC AAAAAGTC FGCCGTTCAA 3 A A C G A C A T C T G F A F G A F C 
ACT :TCATP3rG3GG FTTC 3GATTT PAC AAGAAGTGGCAAACC PTGGTGGCTGAA TGACAC 

5 ATCCACTGTPTCPPAT'P'^ACPACi;:.AATTAAi:AACA';:A!:ATATC::ACi3ACAGACAG.A\GAGA 
TCAGTGAG 3 A P T 3AACGG'LT PG PGGAGACA'iTGGTAGT :-GCAGAOAAAATGAT 3G PGGGCT 
ACC ATi jGC 3 3 3AAAG A< ".'AT P 3AAC ATT A'';ATTT'PG AG F }TG ATi3AATATTGTT 3CCAAACT 
TT'ACCGTGA VTC PAGCCTAGGAAACGTTGTGAATA PTA FAGTGGCCCGC3 FAA PTGTTCTC 
ACAGAAGAT TAG ^CAAACTTGG V3ATAAACCACCA FGO V3ACAAGTCCGTCGATAGCTTCT 

1 ;: " rA AATG3^3A'3AAA3'CCATT3:T3TCCCA^:AAA(T;\;AT )GAJ\A3ACCATTOGA ?A_A.AATGG 

A3iGi:C(r:^T^;GAACA3TGGGCTTGGCCTCT(TiH^;;r.: ;(y'j VATGTGCGAGCC PGA \AGGAGCT 
CCACCATCAATGAAGACACTGGCCTGGGTTCAGCTCTT XPCATTGCACATGAGATTGGCCA 
CY^VPTTTGGTATGAACCATGATGGAATTG* XYAYTTCTTGTGGGACGAAAG ! !XPC ATGAAGCA 
^ OCAAAACTTAT< ;GCAGC r ''C'ACATTACTGCG AAT A '3' YV^TGO^TTTTCCTGGTCYGGYTGC A 
' :;:C;;A( ^ACTACATGACCAGCTTTC 1Y.GA7TCAGGO :GT« JGTAGTTGCCTTi JATAATC JAGCC 
•TCCCAPAGCCyFGAOTYTCYY™^ 

^'^^vgcac^^ 

AGCPXPTGGyGTcT^OAGCAAAAGCAACCGGTGTG FGAC \:. iACAGTATYC'CAGCAi JCTCXYGGG 
FYT GGC ACT3YX ;• !CC'CAG ACiC A'FAGA/PGv XXXX.: G(H.;G" 'C.'CCTGG PCACCATG-: XGGAGAGF 

G'-'-V. ;c/i( x;ac xxpgcgggggaggcgtcg'G': y yppcc ta. XT-/ :a: ' t g~g ac agtco ax ; cax "'• 



cgg;jvc;c:acxaic:tgatgccv\ttgaaggg'P"P' y-tcaatgat^cyxptgccca^ggggaggctac 




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GTCTCTGCTGGTGCTTCTCCCAAGACTATCTTGAAGGTGGGCTGTTTGCCTTTCGTG.2iAC 
CATTCTTGGTAT 
(SEQ ID NO: 02)



Borden Ladner Gervais LLP
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FIG. 1B

>ORF( frame +1) 



MEILWKTLTWILSLIMASSEFHSDHRLSYSSQEEFLTYLEHYQLTIPIRVDQMGAFLSFTV
r^OKHSRRRRSMDPIDPQQAVSKLr FKLSAYGKHFHLNLTLrJTDFVSKKFTVEYWGKDGPQ 
WKHOFIjDNCHYTGYLODORSTTKVALSNCVGLHGVIATEDEEYFIEPLF^JTTEDSKKFSYF: 
N G KE'HVI Y K K S A G OQ RH LYDHS H C G V S D FT R S G X P WW Li 1 E ' T S T V S Y S L P I NT JT K I H HRQK R 
SVG ] ERF VETL\VADKM>IYGYHGR:<DI EKY I LSVMNI VAKLYRD3SLG:.-VV1JI ivarl I VI , 

10 TEDC-PNL E I Mr: H AE)K S L D S F 0 KWQ X S I L S HO S DON T I PENG I AH HDN A\ 'LI 'EE Y D 1 CTYKK 
K ?: IGTLG L AS V AGMC E ?E R S IS 1 NEDTGLGSAFTI AHE EGHNFGEttlHDGIGNSCGTXGHE A 
AKLEEAAR ITANTNPFSWSACSRE'Y I TSFLDSGRGTCLDNEPPKR3FLY PAVAPGQVYDAER 
QGFFQYGATSRQCKYGEVCRELWCESKSNXCVTNS I PAAEGTLCQTGKI EKGWCYQGDCV P 
FGTW ?Q S I DGGWG PWS LWG EGSRT EGGGVS S S LRKCDS PAPSGGGKYC I -GERKRYR SCNTD 

15 PG PGGSEDFREXQGADFDN!l?FRGXY r YNWX?YTGGGVXPGALN^ u 
GTi;m:^AI-SLDICINGECKHV'JCDi\ILG3DARED?j:RVCGGDGSTGDAIRGFKNDSLPRGGY 
ME'/YQI PRGSVHIEVREVAM3KNY IALXSEGDDYY L NGAWTI DWPRKFE'VAGTAFHYKRPT 
DE PE3LEALG P'T S ENL I VKV ■ LQ E'jNLG E RYK FNV ? ITRTG S G E)N E VG F TWNHQ S WS EC S A 
TC AG G KM P TRQ PTQ RARWRTX H I L S Y A L C I , V K K L I G r J I S C R F A S S C N LPXETLL* L Y Y I P F 

20 VFNLM*FVQICW*NTS^NEGLC>. T CFSQDYLEGGLFAFR.EHILG 
(SEQ ID NO: 01)
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FIG. 1C 

Align MP15 with ADAMTS-6 (in public data base):

5 vpi 5-4ur:i versa 1 + i_ORFR MErLVJKTLTi;iL.SLIMASSErHSDHKLSY.7SQEEFLTYLEHVC'LTIPIRV 
ADAPTS 6 * l„ORrl ME I LV.'KTLTv ;i L S L I MAS SE :H S DH RL SYGSQEEFLTYLEH YQL7 I P I RV 

MP I :> - 4u r. iversaM _0 R F 1 DQMG A F L S F Y VK: C D KH S RRR R S MD ? I D PQOA" / S KL F F ? . L 3 A YG KH FKLN L 
10 ADAMT:>6*--_.CRF1 DQMGAFLSFYVKXDFHSRRRRSMDPTDPRXAVSKLFF? L.1AVGKHFH1NL 



KPl/ri -^ur.iversaL-l.QRFl TLNTl)FVSK:TT'''EYV.'GKDG?O^KHDFL J .')\'("HYTGYLr D' jPSTTKV'ALSN 
i 5 AOAM7R6 rl_0RFI TLNT3rV^KKFTV-FY;«Kr:G?QV;KHDFL^:;CHYTGYLC D^RSTTKVALSN 



20 



Mr J l S ■ v:iuversa 1 ~1_.0RF I CVGLHGV I A7FDF F YF IF?!. RG'RTEDSKHR 3 Y EiRGH PHV 7. YF.K RALiR-QRH 
ADAI-17S6 t-l_0RFl CVGLHGV I A7EDE EYF IE PLKNTTEDS KHF 3 V F.XGH PHV i YF.K S : ALQQRK 



YPIR -.^universal- 1 ORFi :„YD:-:SHCGV< Z)F^RRGKPVA:lf;D73TVS\R G.G0I NN7HI HAR0KFSV3 IFF 

•\d,v-:tj;6^i„drf: :.yd-is:icgv5 d ft rf g:< pwv;:,mdt stvs y a: 7 p : mi; y h i h:- •: fof fs vu : e r 

M?I ::- 4 universal * I ORFI F7E7 7YVADr 07 M\ ' G Y H G A K : ) : F H Y I L S VMI ; 7 V A F 7 Y R T. S R I ,G K\ AOJ .: I V A 

R7AYFS6- 1 ORF*. FVETLWADr.t-IMVG-: HGRF7 7 EH Y I F3VMI ; : YAAAYRL S ; AGFA. AOJ7 I YA 

yp;:: j,u::iver * ; _..oo- ; rr. yr:/rfoo. • xef ::;hhadfs:res?ckv;qf;s : FSHCfSc g.; :•: fpngeafa 

RYA7I7FR. * : „CRF7 RR7OG7F0R; OAR : : OiHAORSOFS FCFWQF.S I 73 HO37GO07 ? r MGTA-i:: 

p _ E>--.u:*.i versa -* i_0RF : onaga; 7 par: ic : yf napggyz.goaryagkoepfrfcsr: oo :g:.gsaft 

35 -.pay- „crf; pma t ;g:yry: : c :' y ? u k p c 37 age a s vagi :c e ? e rfc s :: :f.d : o i gs af t 
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API :. - - : ver. S^IG ...ORF - 



a h e : g h o r v :•: f o : g :-j s : G ; : ■: r a- .- ■. a r 

AYR.; ',YY '. YYROR 10" PS RGR. .YYR . R: 



API : : ver £ = .+ F 

ADA.YY: 6 • ORRI 



45 



MP I n nr. LversD ■ 

adamt? 6 . : ORF ; 



r'Y 'E VCR FY 'GO. L, 5 F SNRG Y 7XS I ? AAE377 CQ 7CX T E F.( V 0Y7GF0YPF 
0 A.- FVCRE: AGL3FS7JPAX'7XS: PAAEGGACQ7GXIEFA0 CY'jGDCYPF 



50 MP lb - 4ur:versa i - '.. _.CRF : GTV:?GS I 0G:A\GP07 G'OG E 0 S F7 G G G G V 3 A 3 A A A C 7 SPA> i GGGKYGLGE 
AL)AM7= 6-l_ORFl G TO' ? ; A S 1 0 G R- G F ' V." S LW G R C S F . T G G G G V F S S OR H C C 5PA: SGGGKYGRGF 

MP IS- 4ur. i versal-^i_GRFl ?.KRYFSGR[YLPGI'LGSRFRRFFQGADFD:^;?FRGKYY^V>7YTGGGVFFG 

3^ aoa:-:ys6 + :._orfi f k r y f . s c - rr d ? c p l g s r d f r e k q c ad f d \t : p f r g k \ y revv > pytgggvfpg 

FIG. 1C (Cont.)



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Align MP15 with ADAMTS-6(in public data base): 

MP1 5 - 4ur.iver sal + 1_0RF1 ALNCLAFGYMFY TERAPAV: DGTQCXADSLTTG I NG EC KKVG C DM I LG S D 
ADAMTS6 + 1._0RF1 ALNCLAFGYNFYTERAPAVI DGTQCMADSLDIC INGECKHVGCDNILGSD 



M ? 1 r ) - -3 un i ve r s a 1 -r 1 _ORF 1 A P. E DRCF VCGGDG STCDA I EGFFXDSL PRGGY M EW(J I PRG SVH I EVREV 

A DAI-ITS 6 +■ l_ORF 1 AREDRCR VCGGGGSTCDAI EGFFNDSl PRGGYMEWC'I PRGSV:-: I EVREV 

10 

MPlii-.;univorsal <- 1 . ORF1 AMS h-M Y I ALKS EGDDYY JN'GAWT : DW ? R K F D V AG T A FKYKRPTDEPESLK 

ADAMTS6 <- 1 _ORFi AMSKMYI ALKSEGDDY Y IKGAW7 I DVi ?RK FDVAGTAFHYKR PTJEPESLE 



KPl'j - 4universa 1 + 1 ORE 1 ALG PT3 ENLI VMYI ,LQEQNI.G I R Y K FNV P 1 PRTG S G r N E7G FTWNH Q SW 5 

A I.) At IT S6+1 __0P. F 1 ALG PTS ENLTVKVLLQEQKLGI P. Y K FNVPI TRTGSG L NEVG FTWKHQ ?V,'S 

21) MF'if>-4u::i versa 1 + l_OR? i ECSATCAvGGKMPTKQPTQRAR'.v'RTKHILSYA: /C1LKKLIGXISCRFASSC 

A E i AM T S 6 + 1 _ _ ( J K F 1 ECS AT C AG G K K PTR Q PTQ R A R W R 7 K HILSYALCLL K KLT G :•; I SCRFASSC 



M P 1 l j - -I un i vo rsal ■ 
ADAMTS6+1 ORE 1 



ORE : 



NLPKi 



<;seq 13 



CI 



No 



A: 1 4 c 6 / 4 ; 
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FIG. 2A



MP10-full-length

tic :atc 3tpaatacgact 3a;:tatagg sctcgagc 3Gccg3CCGGgcaggtgtggacacgt 
ggc :tc tatggctcccgcctgcgagat 3ctccgc f 3g ^ :c gtcg 3Cgtgggggtg 3g 3Gtc 

AT 3PTCGAGGTCACG 3ACGCCT rCCG 37 CTC AAGA 7GAGT3CCT 3TGCAGTGTG 3AGAGCT 

atga 3A r :gccttc :gga 3 cgg 3gtggacc ac aa 3gggggactg 3tggccttctcgg 3acc 

KJ TC 3 PC-' 3 C 3GGAGGCA 3 3G 3CGCGGCA 3GGGG 3 3CA 2! AGGCGAGT 3 3CGCCT 3777 TAG AAA 
GT 3 3GC P 33GG CCA 3 2 ACGG ACTTC 3TGCTGAACC ? 3ACGCGG AG 2T2 333GTCTAC PGG 3AG 
GG 73ACG 3 3TCCG7 3 3AGTACTG 3A - AG 3GGAG 3G 3 C7GGCCT 3G 3AT3 AGGGCGG 3CCGGCC 
CCAGTGGGTG TAGG 2" 3G PCACCT 3GA 3GG : 3A3G 3 GAG GAG 3TCC 3ATGTGGC 3ATCAGC 
AC 3TG7GGAGGCCTG : ACGGCC7GA 2 3GTGG 3A 3A I'GAGGAAGAGTA 3CTGATTGAGCCC 2 

1 5 TGCACG 3 FGGGCCCAA 3GGTTGTGGGAGGGG 3GA 3GAAAGTGGACCA 3ATGTGG rGTACAA* 
GCGTTG 3 PC PCTGCG 7 3AGGGGGA 3 3 FGGA 3ACA 3CC P5TGGAG rGAGAGATGA 3AAAGGG 
TGGAAA 3GGGGGCCATGGTGGGTG 3 3GAGG TTGAA 3CGAGC 3CCTG 30AGGCC 3 3TGGGGA 
AT 3 .VG-i C A G A G C G T 3 G 3CAGCCAG 3 33CTGAAG 3GA7CGGTCAGC 3GAGAGCGCTACGTGGA 
GA 3 3CTGGTGGTGG 2'2 3AGA.AGA TGA'TGC T 3G 3-3 3 AT 3AC 3G 3CG 3CGGGATGTGGAGCAG 

30 TA PGTCCTG 3G 3 AT 3ATGAACAT FGTTG 37 AAA 3 3 P7 7 3CA 3 3 AC 3CGAGTCTGGGAAGCA 
CC 3 3TAAC A PGG T 3G7 AA 3T" 3G 3 37 3 A - PGG PG 3 3 3-CG 3AG3A3 3 A G C C C AC T C 7 G G Aj 3 A 7 
CA 3 :CA 3GA 3 GCCGGGAAGT 33 3 3GGA 3 AG 3T F3TGTAAG I 1 3G 3AGAAATGCA PCGTG.XAC 
CA 3AGCGGC 3ATGGOAATGC 3A FTCCAGAGAA 3 37 FG PGG 3 F.AA 3 ■ 3 .A 7 G 3, C A- 3 AG G A G T 3 ■ 3 
3CATCACAC 3C7A7GA 3A 3(3 3' 3 3 Y30PA TAAGAA 3AAA 3CC PG 3GGGAGAGTAGG 3CTGG 3 

35 3GCGGT 3GG 3GGAAT< J T 3 3GAG 3GG 3AGAGAAG 3TGCAG 1 " 3 P JAATGAGGACA3P 3 3GCCTG 
GO 3 \CA 3CG FTGA 30C P PGG 3 3ACGAGA- P ,, 3 , 3-3 | 3 3ACA 3. Y 2 T 3 3GC337G3AC 3ATGAGGGGG 
3GGGAAA 3 A 3C7G 3G= ; 3G 3 3CG 3GG 3GAGGA33"' 33, 3C 337, 3 3 FCATGGCTG 3CCA 3ATTAC 
53 A7GAA 3 AC 55773 3CA PTC 3 P'G 3GG FCATCCCG 35, 3 C 3G7GA 3 3ACATC5,CCAG 3 PT PC PA 
-AC 3CG 3 3C 21 3 3GG 3 FC'F 3 3C FG5YAC.T3C GO 3-3 3CCGAGA 3AGGAC'P'PT 3TGTAGGCGA 

50 CAGTGG 3A 3 3G 3 3 3C50G 3 3 PA 3GATG 3AGAT : 3AG 3AA03 3C 305TCCAGC ACG 3AG PCAA 
A7CGCG 3 3 3 3C 2 3 3AG 3GA 33T 3 3GG 3 37' 3' P 1 "- 3 3C '3AGAG3AA3CGGT533A7C A 30AA 3AG 
CAT GGC 3G 3CG 3 3GA 1 ; 3G 35v 33 3 PG T 3 3CACA 3 305CAC03 3 3G3v3AAGGGGTG 37 3 3 TAG 



■ ■••A/ v.! ji j j\ „ . : .j : j' - 1 _ 1 3 ,', . C' : : 3 [ -VC 3 3'- 3C03A- 33 37 • ' 3- PG 3AC 3G5\GC 3 3 3 3 3 3 3CGG'3 
'-GA- 3' PC 3. -\ PGG 3 3GA 1 3 PG 3 3CC- 3 3' 33, 3 3 7G7GG 33 5C ! 3GCG P- STC 37G'I"PG 57.G 33 3 PC AC 

5- 7GGGAGAGG 3 3 3AGO : 3AA :CA 3Ci3G 3 3GC.7A 3 33, 3 3G3\3 3CCG7 33v3A3i.AA3G 333 3CACC 
GG7GGT3'3AA3A3GGArGAC7G 3Ci3CC 37GGC7 3C 3 AG; 3 ACT 3CAG3V 337v 3 3GCAG P 37TG 
jj/i-wTiS.'. 3AGGa3i ! 3 3 7 7^' 3337G 3 33,A3,30030C3AA 37GGA77A3V3GTA 3CG 3G 3AGGG 
GG ( 337 1 3A3v3 3 3C7G 1 37 2 3 3 3 O^CGGG 3 33AGCGGA,3GGC ^CCAAC 37 3TACACG 3AGAGG 3 
CGGCA3 3C 3 3>3 2'2G^:, 3GG 3A 3 AGC' 3 TGCCGTC' 3 A 3AC AC 3G7GGAGA7TTGCGTC AG7G 5 

41) CG30C3 3 3/G3G 3A 3G7'3 3 : 2-2 3GACC 33,'37CC7GGGC^CC 3ACC7GCGGGAGGA :AA37G 3 
L 3.-iGC J TG l'GG 3G'33 GA' 3 3G 3.AG7GG' 3 3GCG3\G3,' 3 1 3A7CG AGGGGG7G3"3G3-'.GG 3GA3 3G 3 
CACC7G 3G 1 3 3CGGG73, 3 3A3 3A7G7C 37G7-GGA7T 1 3Gi ;3vA\G'3G7' 20 3 3'3GAj3A7'3 T 3 3A3 
C 3A' ;< JA PCT- 3AACCT 37 37'3 3GAG7' 3 AC77GGO 3 3:G3v\GGGAGAGGAG 3AG7GCCTG 37G 
C 7GG.aG 3G» i' 37GGG'P ; t 1 j 1 C-\C 33GGOA'3 ! 3CGG3\GG i 33"C3" 1 (3 | 3 37' 37A'3i 3T<3 2'j2\ i 2C2\'2 , 2 777C 

4^ A.^G'PG ; 3GAGAGGGGG , "J.AGA 1 G 33^3G7'3- "2K>222~jCt.V[- 1 2 3AA.G' 22C r 2G'2^2^2'2 3A77.A.A7G 3 AT- 3 
TGTGAT- 3GTGATG'3'73'3TGG 3GGGGA 3i3GAGCT( 33 1 37GCCCTCC: 3G7AG' 3GGTTGA^A7- 3GG 
CG!3A7GG'3GGG7i3AG7CGG7GGCGGGC73i(37C<37G:3CAC7ATGGG'3i3G P 3GAGGAAG7GG7 
«3Gi3i::GAG7<3TGGAGGCGGTAGCCAGG PGG AGGCCGTGG AG TGGCGCAA7CAAGGTGGAGAG 
GTGGGGGGTGGgV'JGGCAj 3TACTGGA'3TGGGCA< 3 AGC3^A! 3: 3TTGG i 3i 3 AAAAGG.A-AGGGCG 2 

50 CT< JGAAGA | 3GGAGCGTTGGGTCAAGAC7Gf3G7T(3TAGi 3A.AGTGTG 1 3CTGTG | 3AGGGG 3AGC 




TTG:GATG?AAGGCGTGCGCAGC_:'jCTC-3GTCGTGTGCCAAGCGCCGCGTCTCTG:C'3CGA 
AGAAAA jG IGCTGGACGACAGCGCATGC 2 2GCAGCCGCGCCCACCTGTACTGA3G rCTGCC 

acgggc:gacttgccctcgggagtggcgggcctcgactggti:tgagtggaccc:caggtgg 

GGGCCGGGCCTCCGCCACCGCGTGGTCCTTTGCAAGAGGGCAGACCACCGCGC 2ACGCTGG 

CCCCGGCGCACTGCTCACCCGCCGCCAAGGCACCGGCCACCATGCGCTGCAAC TTGCGCCG 

CTGCCCCCCGGCCCGCTGGGTGGCTGGCGAGTGGGGTGAGTGCTCTGCACAGTGCijGCGTC 

GGGGAGCGGCAGCGCTCGGTGCGCTGCACCAGCCACACGGGCCAGGCGTCGCAG'GAGTGCA 

CGGAGGi:C':'TGCGGCCGCCGACTAGTAAGCTTCGTCGACCCGGGAATTAATTCCGGACCGG 

TACCTGCAGGCGTACCAGCTTTCCCTATAG 

(SEQ ID NO: 04)
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FIG. 2B 

>MP10-full-length+3_ORF1 Translation of MP10-full-length in frame +3, ORF 1,
threshold 50

^PACQILRWAr.AT,GT.G'iMFEVTHAFRSQDEFL?S'LESYEIAFPTRVDHr-. T GALLAFSPP?P 
F.R^'RRGTGATAESRLFVKVASPSTHFLLNLTRSSF.LLAGHVSVEYWTREGLAWQRAARPHC 
I.YAGHLQGQASS^KVAISTCGGLHGLIVADEEEYLIEPLHGGPKGl-RSPEESGPHVVYKRS 
SLF:4PHLDTACGVF-DEKPWK.GRPWJLRTLKPPPAFPLGNETERGQP'GLKRSV3?.ERYVETL 

1 «' ) V VADKF^AYHGF'.RDVEQ Y VLAIMN I VAKLFQD 3 3 LG ST VN I LVTF.LILLTEI'O PT1EITH 
HAGK3LDS FGKVJQF'.S I^/NHSGH GNAI PENGVAIJHDTAVL ITRYDIC I YK2I7.PCGTLGLAP7 
GGMC2RER SG SVNED I GLATAFT I AH E I G H T FG MX H DG VG N S C G A F . G Q E> P AKL MAAr i I TMK 
ITJ ?F'\/W3SC3RDYITSFIX^3GLlGLCL^ T NFLPPRQIlFWPTVAPGC^VYDADEQCR.FQKl3VKSR 
CLC^^WSEQEQPVHHOC'HPGF.RGFIAVPDAHKRC i CA^/LQTrjLCPI J Vr v \^E , RGCGr.SLGAVDS 

15 i''GE : C3 F:TGGGGVSSSSF:HCFl3PR?TIGGF3^GG<3ERRRHRSC , NTG)DC'PPGSCOFREV'C)CSEF 
DSI F , FRGXFYKWKTYRGGGV:<ACSLTC: j AF.G^ 

KHVGGDR7LGSDLREDKCRV0GGDGS AGKT I EGVFS PAS PGAGYEDVYVG. PXG3VH I F ICO 
LNL S L SHL ALKGDC'E s ll l egl pgt pq pkrlplagttfqlrqg pdqvqs l E ALGP I NAS L I 
VMVLARTEL PALF'-YRFNAP I AF.DS I, PPYFV 'H Y APWTKCS P3 VgAVARGF:F;^3 AATKGDS 3 A 
'/A F H Y G 3 A H 3. K 3 A£- KQ AR I ,QHG A 3 PGOIA^. 'G T 1 V A LQ PQ L AF'QG V R S R S WG Q A P RL C RE E K 
A:..Fj:j3ACPQ?R?PVLR?ATAPLALRSGGPR3V * 



(SEQ ID NO: 03)
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FIG. 3A
MP19 

CCGGTTCCTGCCATGCCCGGCGGCCCCAGTCCCCGCAGCCCCGCGCCTTTGCTGCGCCCCC 
TCCTCCTGCTCCTCTGCGCTCTGGCTCCCGGCGCCCCCGGACCCGCACCAGGACGTGCAAC 
5 CGAGGGCCGGGCGGCACTGGACATCGTGCACCCGGTTCGAGTCGACGCGGGGGGCrCCTTC 
CTGTCCTACGAGCTGTGGCCCCGCGCACTGCGCA^AGCGGGATGTATCTGTGCGCCGAGACG 

cc 3ccgcc ftc pacg agc tacaataccgcgg scgcgagc roc 3ct r 3 aacc pg accgcc aa 

T 3 AGC ACCT 3C PGGCGCCCGGC PTTGTGAGC 3 AGACGCGGCG 3 3GCGG 3GGCCTGGGCCGC 
GCGCACATCCGGGCCCACACCC 3GGCCTGCC ACCTGCPPG 3 3GAGG rGCAGGACCCTGAGC 
ID PCGAGGGTG 3 3C PGGCGGCCATCAGCGCCTGCGACGGCC FGAAA3 3TG rGTTCCAGCTCTC 

caacgaggac tac pfcattgagc 3cctggacagtgc :ccc 3 3 3cggcc tggccacgcccag 
:cccatgtg3tgrac.^gcgrcaggccccggagagg:'tggcaca3 3 3gggtgarrccagtg 

CPCCAAGCACCPGTGGAG TGCAAGTGTACCCAGAGC rGGAGCCTCGACGGGA3 3GTTGGGA 

GC AGCGGCA 3 3AG rGGCGGC 3GCCACGGCTGAGGCGTCTACA 3C AG 3GGTCG 3 PC AGC AAA 
I 3 GAGAAG rGGGTGGAGACCCTGG FGG7AGC rG ATGCC AAAATGGTGGAG TACCA 3GGACAGC 
3GCAGG rT3AGAGC , TAT'3TGCTGACCATCATGAACATGG , I , GGCTGGCCTGTTrCATGACCC 
CAGCATTG 3GAACCCCATCCAC ATCACC ATTGTGCGCCTGGTCC PGC PGGAA 3ATGAGG A3 
3AGGACOTAAAGATCACG 3ACCATGCAGACAGV3ACCCCGAAGAGCTTCTGC AA3TGGCAGA 
AAAGCA FCAACATGAAGGGGGAFGCCC ATCCCCTGCAC 3ATGACAC TGCCATCCTGCTCA 3 
20 3AGAAAGGACCTGTGTGCAACCATGAACCGGCCCTGTGAGACCCTGGCACTO FCCCATGTG 
G:GG>3CATGTGCCAGCCG 3ACCGCAGCTG 3AGCATCAACGAGGACACG 3 3CCPGCCGCTGG 

3 c ftca :tgtagcccacgagctcgggca 3agttttggcactc agcatgacg saagcggcaa 

TGA3TG r 3AGGG 3GTT(3 3GAAA 3GACCT PTCATC\ATGTC?CCACAGCTCC 73 FACGACG 3C 
GCTCCCCTCACC rGGTC C3GCPGCAGCCGCCAGTATATCACCAGGTTCCT rGACCGTGGGT 

23 GGGGCCTGTG 3C7GGAC 3AGG 3FCC7GC 3 AA< 3 G A C A ' I'''! 1 A T G G AC T T C C C C T 3GGTGCCA 3C 
TGG 3GTCCTC FA 3GATG TAAG 3CACC AG rGCCGCCTCCAG FACGGGGC 3 FA 3 FG FG 3C7 FC 
7GCGAG 3ACATGGATAAPG V 3 FGCC ACFi 3AC7CTGGCGC FC7G7GGG 3ACCA 3C73TCA 3'" 
CCAAGC 33 G A rG2AGCC 3TGGAGGGIAA3CCGGTGTGGGGAGAA7AAG F3GT3 FG FCAG TGG 
OGAGT 3 3GTA 2G 3G7GGGCTTCCGGCCCG3^^3GCGTGGA3'GGTGGCTG 3T 3TGGCTGGAGC 

M i GCCTGG PCCA FG TGCTCACGGAGCTGT GGCACGGGCG7ACAGAGCGCC 3 A 3 3GGCAGTG 3 A 
GGCAGC 3TA 3GG 3 1 3 A . AA TAG AAA G G C A G A 7 A C F G 7 G FGGGTGAGCGCAAGC 3 37 FCCGC 37 
C7GCAA 3C7 3 3 AGGCCTGCCCFG* ; TG G C C' 3( ICC C 7 C C V FCCGCCACG F 3CA 3 FGCAGCCAC 

GAGG7 FG7CGAGGC7GCCAAC77CCTGGGACTGCGGAGCGAGGAGCCGGA 3 AA3T AG 7 TGC 
■t< 1 TC3\A/FGGTGGCTGGACCATCC AGCGG A7*j3GGGG AC TAG* 2 AG GTG G (3 A 3GGACCACCTCCAC 
ATACG C ACGC AGGGGC AAC TGGG AG AAC C7C ACGT C C C< 2GGG TCC C AC 3 AAGGAGCCTGTC 
TGG ATCCAGCTGCTGTTCCAGG AG AGC AACCC7GGGG7GCAC7AC3 A 3 r AC ACCATC C AC A 
GGGAGGCAGG7GGCCACGACGACGTCCCGCGGCCCG F3T7GTGC7GG 3A77A7GGGCCG7G 
ACOGGTGTGGCOTGTGACGAGGCCGAGGAGCCAGGCAGCGAAGTCACGTGOTGTCTGCCAO



TGTGT 3GGTGGGCCCTG 3GGACAGTGGGOGGTGAAGGGTGAGGG 


AG 


3GG0TC0T0CAGCCA 


CGAG 3TOTTGAAOGAG 3 3TGA0TT0.ATG00G0AGGA 30TGGGOC 


3.A 




TO AT 3AOCCAAGGGAG 3GAGCATGGGGAACGGCATTGAGGAGG.AGG 


3TCCAGAGCTGGACC 


To-- --j'j'j'j-'-C'j T'jTT i-j V 5'jACgAa. PI ? a A- TAG G A _7.Ai_AAT 




3AT0.AAT TTOOAOGA 


G j.iT _T'jT _ - - A'^'j'jG _ T.TGA'j'j.Vj — 'JjATlTA'jAL'-T'j'j 


3G 


J 1 J [ _J A F _ A [ _J ! j' J r jA ! _ ! j, ; J7 JJ 


AO AOOO 30AOCACACA 50 5GTOOT- .50 rGOGOOOTOOAOGGGTAG 




3TGTGOOTGO'3AOAG 


A [ j J ^ l 1 1 A'j'^'.. AA J J-jA J [ J 1 j'jf J j. A i Ij'jl jA j. Jj'jI UL 




3AGO'30TTGGOOTAG 


*\ l"'l" , ." l l" l l "'r"'." , ^ 1 l' , ]' <, T l l■ , ■' , .■ ' l ft "ft-v-i.-!,^" \ p ft. i~i ft i -1 ,^ 1 a .~> ^ '""."^ "'."iti.' 1 
'A-, A' j'j^L'j'j A.'j 1 ^ i A ^'^'..'.J-. A-- 1 - - 1 _.-VjAIj , _A'_3.-i. , _ 1 1 j 


3G 


-AAOOOTTTGATO.AAT 


TTOOTGOO'7 3AGGAAGA 3A 3GGCGATAG 3GG 3CGGAGATG FOGG 




rOOGOAGOO 3G TOOT 


.-i -i.-i -i A 1 -i|~ i r '~ , -^'"Ti--pi"',"' v^p 1 ft-p ~ , ,~',~'i~'"p 1 ~' , .~ i ft "1 ft r* ft , ~« j ~> ~r. . , , ft,~i "',"1 


ft 

j.-i. 


3 AG 0 0 AAAA.TG ATT T 


A'jr 1 i'j J ..^'j'j.H-A j ^'^.I'jH'j „^, J . 1 ( j'-'A^'-.-A- 1 1 1 j'j'_. 


3G 


5 AO A< 3 G A 0 0 AA TG A G 


[ j r rr : _.aa 3 j.^i'jAT'jAj'jIA-.'.-l.a.^j'jj'-l'j'Jj'jA'j'-A'. a 




rGGGCGGG.AGAGOO.A 




"Pi" 1 


T'-CTA'jT _0 PGAOG P 


1 j'j'- 1 j' j.A 1 j'„ r jt'A.jGA'.AA j 5 .A j jOAO.aG rGGOOTGGGAGO AA3 Zl 


ro 


3.AGGGTGG0 3TGGG'3 


[-1 ^''Pi^T'^i^i ?1 l" irn .'^l A P" 1 P -'i" 1 '^I'^Jl'^Ti^'P'^l^r" 1 :" 1 !" 1 ^'" 1 ^ " , ' T 1^ ^P" 1 P 

_ 1 _t 1 j.n _ ,-i'.T ^ _iAn _ i j j. j j 1 .- A _ i 1 j 1 i 'j'j'j 1 ^ 1 j j 


— 


rTOOTGOTOOOATAG 


p ~i p.-" ~i -> ft , -1 >\ -1 ft. p. "1 ft ft -i, — 1.-. ft-i.-i -1 ft.~i p "i'"p'pr',"^.~- p -1 "1 ft." 1 ~^ _ i -1 


3G 


3AO 5000 T 3COT0G0 


ft,-i,"',-i -i-> ft.-' "ft ft . ""',"'' ft , -i.-i.-i p, ■ "t ft p-i,'1 -ift,'ft."i p-i "ift -ift.-irp.-i-p-ii "Ift.-i n , -1 ■ ti 


3G 


3GG.A00TT00T0000 


AO.AAOO'O TO AOTGGCC TO 5 3G0A0ATG0 r TGAGOO V ^OOOT AAA 


50 


3 AG GAG 3 3-AAGGGTO 


AGOOTGAG V ^:::0T0AG0O ~: PGAGG PG00 TO TGAG 0r0 rAijG CP 3 


3T 


5TO 3AOA3 3A330TT 3 


G'jAC.AGCCC "^'-:j'J0.AA0AG HO AOAiG.A 1 .? V 2 2 2 TGAGAO JC.AGC JG 3 


3G 


SOTG'OO'AG 3 3 5GG0T 


G A-A'JO'.jGGG ^'J'JOO'.^GGG jA 2'2 JG i'TG 2 Y V jTOAG J.A.A0GO0.A3 


3T 


3G0.AAG0GG 3.AAA0 2 


GG.AGOGAGT 30 TOT AGO A OJTGTGGCOT 3 2 3TG0GG rCTGG.AG 2 


30 


5GTGOGO FG TAGOT : 5 


OGGO-I'G^jGAOCJAGG AO TG 2-2 2'2 300G0 Y22 Ol'GGOOO AAGO 3TG 


30 


3G00G0TG 30A0GTG 


0GG00 OTGT j'30A00T 33 ^A3T '.A\5G0-AA 2 rGGAGT AAO-T 1 3 3T0 




50.AG0T'G0 3 3 5GGAG 


1 t i'A' , ^ , _i ro.A : j i '.jOGG'j a 2' j 2 2 OA 3VGT 3T 2 5 .AO AO AOGGG.AO JTO 




30CACTGG 3 3 300T 2 


1 ™- - A ■ jT _ j'_. '_ _ ( j'/j'.j r ^ 5 J'.AAaG'J 3G 3 rGOGC AG 33G 2 2 : 2 T 


"tO 


5GG 5000.AG 3 30TG 5 


0 rOAGOTOG 7AO AOATOTT : 2'? 3GAG 3GA jTGOTOOGAGG : 3T-3 


TG 


jO'j-jrTG'jT'jr.A 50.AG 3 


A'j: 1 J 1 jT : JT.^j j ^ '5 A TO' - GO 00 2 3 A 500 A 3G0 3 POTGOG A 3' 5. A 1 3 5 5G 




3 AG Aj 3 '3 0 A V 3 AO 0 AO 


0 3GGCCCOG0AAOAO0 3A0 2 30 TGCA 3G0AGTGGGT 3 3 T'3G 3 5 2 




FGGGGO 3 AG 2 30TO.A 


goooootG'TGGtgg :g -it ;r 30 agog 3 300 3TG'3to.aa 3tg ro v 




A< 5 .A ' 5 0 0 AG .AO AGG G 0 


A' 1 .7 0 0 1 3 1 5 a G G .A A 1 5 AO AG ' PG AG 5A5A'G'P 5G0 J.AOGAGG'J 3TGG 50 P 


3 A 


.jAj''.- 5 '_. _ ! ^'.^ .7 ..'J.jT'i 


A^JGOAOOGAGG.ATT^; OGAG 3 30 30 3GAG 5 3 30000G 3 TGTG A 3 3 


5G 


5AGCGC 3T 3 5 30 , 3TO 


G i 3GTTOOG;3G.AG.AO , 30 i rGO > 5 30 TAG 1 !'' 3G 30 3G0TG0C.AG0T 30 3 




30AT0030.AOOOAGT 


GOOG'-OCGGT 3GOGOOOTi3i3 3 30 3AG00A0G 5CGCGCCGTCC 3 3A 




- A'^ '_."1. I _7 I _ J j'j'. F'j'j. 


| 3CG00G'5TG V5TGOGCCAG 3ATG0 AOAGAO 3GA00G.A 3AGA00 V 




3TGC00A0 3 A 3GGG0 


OGTGGCGGAGGG 300G0i3O0 3TG0GO0OT.AAT GGTG 3T.AACOO 3 




3T0.ACTAC 3 3 AGO AG 


CAGGCA'GGGOAC 3T , .3CTCOC 30T0 A.AAAA.AGG TATTTTT'TT AT T 




aAC.A'vj'TT'F j TijTAAC 


A TOO -AT T -A TO .A T T T T A 1 5 A T .AAA T ( 3 AG 0 A T 1 3 T A 0 0 .A T T' 0 0 .AAA 1 3 5 




.AAA^AAAAAJOAAA.A.A 


.AAA A A A A A.AAAAAA^AA.AA.AAAA„AAAAA.AAA.A (SEQ ID NO: 6)



FIG. 3B

MP19-full-length+1_ORF1 Translation of MP19-full-length in frame +1, ORF 1, threshold 50


PVPAl^PGGPSPRSPAPLLRPLLLLLCALAPGAPGPAPGRATEGRAALDIVHPVRVDAGGSF 
LS YELWPRALRKPDVSVRRDAPAFYELQYRGRELRFNLTAIJQKLLAPGFVSETRRRGGLGR 
AHIFAHTPACHLI,GEVQDPELEGGLAAISACDGL?:GVFQL5JNEDY?IEPLDSAPARPGHA0 
PH\ p /YKF.QAPERLAQRGDSSAPSTCGVQVY PELEPRRERWHQRQQWRRPRLRFLjHQRSVSE 

10 F K W 1 / E r P L "/V ADA]- 'RTV E Y H G Q ? Q V E S Y V L T 1 1 -1NMV AG L F H D ? S I G N P I H I T I V R L V L L S D E E 
SDLKITHHADNTI'KSFCKV^KSrmKGDAHPLHHDTAILLTRKDLCATI^^RPCETLGLSHV 
AGMi^QPHRSCSIJIEDTGLPLAFTVAlIELGHl-iFGIQHDGSGIJDCEPVGKRPFIKSPQLLYDA 
APLTWSPGSRQYITRFLDRGWGLCLPDPPAKDI I DFPSVPPGVLYDVSKOCRLQYGAYS AF 
CEDHDIJVCHTLWCSVGTTCKSK^ 

1 3 AWS I C SP.SCGMG VQ SAERmCTQ PTPKY KGK Y CVG ERKRF RLCriLC AC PA( 'JR P S F RHVQCS H 
FlLAHLYKGP^HTWVPVWDVNPCELHCRP^ 
ICKJr/GODFEIDSGAMEDRCG^^^ 

E YAE AAN F L ALRS E D PER Y r LMGGWT I C'WK' YQ VAGTT FT YARRGNWE:' J LT S PG PTK E P V 
W 7. Q I ,Tj F<jE.?N PG VH Y E YT I riR E AGGH DE VP P PVFSWK Y G P WTKCTVTCGRGVQEQNVYC LE 

JO R^AGPYLiEEKCDPLGRPDDQQRKCSEvPC^^^^ I RSY 

G : .PEQ3 ALEP PAC SHL PR P PTEf PCI JF.HVP :TATV. r A7GNV/S'jC£Y': , ';:GFG , : i QR?^r/Li::TN:) 
TGVPCDEAQQPASEYTCSLPLCRV/PL^TLG PSGSGSGSSSHELFKEADF I PHHLAPRPSPA 
S = PK PGYMGMAI EEEA PEL3LP 3PV1 'VjJr fYDYMF E M FHEI il , 5-YGPSE E PDLDLAG'I^DR 
TPPPHSRP^PSTGSPVPArEPPAArCEEGVoGPWSPSPl^PSQAGRSPPPPSEQlTGNE'^iri 

J3 F L P E ED':' ? I G A PL! L-G L P : : J E 3 W PR\ 'STPG : r PATPE'SQMDF P VGKD 3QS0 LPP PWRDETX; : 
YFR^DEEPKGRGAPHLPPRPSSTLPPLSPV^STHSSPSF'Ij'AAEL.V.T SGT^ Ya'EPALEOGLG 
Paj:-;EIY/:PTYGVASLL?PP:A?LPEME\^ 

T TLi TG Ia : HX P E PA LN PG P K 3 } P E 5; l S PE Y PL-SSREL dT P A'.\T>S PAN8KRV PETQ PI. A PS LA 
EAGPPAl'PIAVmASWQAG^^ 
.M) R PC ATW: • SG.IWS KCSRSC 3 3 3S S VRD'Oj 3VDTRDLEPLRPFKCQ P 3 PAK P PAKR PGGAQ PC 
[ . SVTYTS FWREJCS EACGGGE ^ORLVTC P :! PG EC EE APR PR'TT R PCNTK ? 3 TQV/VYG P V73QC S 

A P C 3 G G \ * Q R R L V K C VN T Q T G L P E E D S 1 1 Q 3 G H E A', "J P E S S R P C G T E D C E ? V E P P R C E R G R L S F 
: Ft 'FTRR LEG ECO LPT I. R T 0 3 C R S f ' : S P P S H G A P S R G H O P ' /' A P ? * 

(SEQ ID NO:03)
FIG. 4A

MP20

GGTCGTGGTGCTGiJA(3TTTAAGT'r(JAGTA'jTAG<"AATGC':i'.iTA'lTAGTTA(:;GATAATAT7^ 
5 ATAGTAAV^APTAAGAATGGTTATGTTAGGGTTGTACGG^ 

^GTGGGT2GV^TG3iGG3TjT3iTGCT2G>/jA3rTT2 1 GC l 2T AG* 2P 1 ? jGTTTG' 3TTTAAT : 3 ( J ACOTOA 

agt'.;' :< :tgc2aat< ;at< j« } a' : aag attg ag a> 3ag?« 3 ag g a 3 aa< :;g- :?tacgttt agtg aggg a 

G AG 2iT T TG G T A T A T G A T 7 G AG AT G G G G G C '7 A G 7 ' P T T T G T i 3 A 7 G 7 G A< 3 AAG AA G 22AGG' :GGG 

AT' )toagaggggtgg' :■??< ;< ;< ;taagc2 i ctgg< aactcagat^gtg;^aagggggcta7tcctag 
io ttttattG' :t at aggg a' r ' r a' pg a 1 p 1 r a 1 r' :• aat ;atgagtatt aa ttggiaagtat pggttatg 

GTTGATTG P 3CGGAGAGTATATT' 3TTGAAGAGGATAGCTATTAGAAGGATTATGGATGCGG 

ttagttgg- ; pgaggat^ytpigt ?gatggcagct p g t g t g g aa.- r g ag g g t t t a t t f ptttggt 

TAGAACTG JAATAAAAGCTA' 3 3ATGTTTA77TC FAGGCC7AC7C;v3G7AAAGVYA7CAGTGC 
GAGGTTAG 3 )GT< jTGA PGA' ;T3T' 5GG PGC AAAG ATi 3GT AG A 3 TAG AT 1 AACGGG 3GAGGGGT 

1 5 ggg/^agcagcatggag pttgtat gctgggccagacggctaa :ggt:otggtgcgG'.;a .:gt< i 
gccgagatggggaggcgagag 3C ;g-:«';gi:g< ;co 3tggggaa 3G2^caggctgca2ccgaggc 

AAGTGAAATTA77AGAGAC 3 : FGPiG 3GAA72iCGAAiA7CGTGTC pc::catccgagtgaacgg 
7CTCGGAGAACCCTTTGCGA ; 3AA< > 5' L 1 ' "'.'A 1 ,TT 3 AAAAG/GV 2G' 2< 3ACGG A< 3C ATTAACTCT 
G 2C;vCTGACCCGTGGCGTGCG7TC< 5 2CTCC7CCTCTTCC7 , CGT ':TAG 2TCC7CCCAG 3C 3C 
20 A PTACCG :GTOTG PGCCTTC 3 3GCAG AAGTTT' 2 TAT' TTAATGTGACC 3CGAA7: ; : :g ;ATT 
TATGGGT -GAG PG P7CACTGTCA "C 27 3CTCGG jA 3G 2CCGGGGTGAA2T 2AGAC 3AAG7TT 
TATTGCGAA 3AGGAAG' > 5GAA 37GAAG :AGTGT ?T 2TACAAAGGCTATi AP ; :aA7A 2CA\CT 

g :gagga:a :gggggt :atcagcct2T3 2tgag 3aatgctgggcacattgcggtgtgatga 
t3ggga'f7atttta 27gaaccac pa jag 3c7at 3gatg aacaagaa 3 atgaagagg aa 2aa 

25 AACAAACCCCACATGA P7TA TAG' 30 5 2A30GC0 :G 2CAGAGAGAGC 2GTOAACA jGAGV 3G 2 
A TGGATGTGAi AAOGTO \GA2v2AC AYAAATAGGOA 2AG PAAAGAC AAG \AG AAAACC AG AG 2 
3G\GAAAAT' 3GGGAGAA-AG' 3 A i 7aT\0 JTGGOaGG PGACG 3AGGAGC--i. - A\.v. 3.-- 1 3CG 2 2 i'A. 

G 2 AACAG AGGOATTT rGTGG PTATG 3TAATAAGA 3G' ;AGAA 2AG.AAG AGAAAAGAGGACC 2 
A 2 A 1 AAAGGACAAAACG P'F 2 2 2TA PC 2TA rC 2A2GGT , 2 2GTAGAAGTGTTGG TGGTGG 2 AG A 
30 OA\ AA3AATGCTTTCA PA 2 2ATG 3AGAAAA 2G PTOAA 2A 2TATATTT rAACTPTAATGTCA 
A TO 3TA ;CC PCTATG L2V2.AAAGA 2GGAAGTAT 2GG.AA\T 2T.VA 2 r.AA TA ["T 2G 2 A 2T2TGA 
2,2T'[\AA2TG 2GATTGA 2A \ 2G.AAG A 2GATG 2 2 2GTT2:GA PAT 2 2 PTTAA [V; 2 2GAGA 2AA 2 

A22TGG T 2T 2 2 V PGTTA„\ 2A^\ 2-AGAGGATATG T 2GAGAGGTGA 2 2A2AAAT 2TGA i'AGGTTA 2 

35 G 2GTGG 2T 2AAGTGG2.AA2 2 ATTTGTG.ATG 2GTATA 22-vAG 2 2 2 P 2GTATTA TPGAAGATAG 
TGGATT-2A 2 PAGAGGT PT PAGGATGGGGGA PGAGGTGGG 2 2 \ 2 3 IGTTTAAGATGGGTGAT 
GATGA.G.AA 2A_A ZAAAT 5T.VA\G.A\ 2 .A-\i 2 GAj 2 T L2AAG A 2 P 2 2 2 2A 2CATGTGA7 5GGTGG.AA 
GAGTGAAI2 P PG PAGA 2 2AA 2 2GGTGGATG TGG P 2 AAA 2TGTAGT 2GA^AAATATAT 2AGTGA 
GTTTTTAGA 2AGTGG T T.\ V 2 2GG AGTGTTTGG T 7AAGGAAG 2 T 2.AATGGAGAGG 2TAGGGT 

40 TTGGGTGT 2 2AAGTG 2 2A 2 2 2A7GGTTTA 2.AA 2 ^TGAAT.AA.A2 AATGTGAATTGATTT P PG 
GAGGAGGTTGTGAGG 2GT 2 2GGATATATGATGGAGTGGA 2AGG 2 2T 2TGG TGG2lATA_AGGT 
GAATGGA'2 rA 2 AG AA AG 2 2T 3GGGGAGT 2AGGA 2AGAGGG TGG 2 2GGATGGGAG 2GA 2TGG 
GAGGGT 3GAAAGGA 2 7GGAA 2TATGGATTT PGT 2TTGGG AAAG.AAA ?GG.\?G~C 2CCGTGA 
GAGATG 2A V 2G TG 2 2GAAGTTGGAG P 2GGTTTG 3AAGGTGGTGGAGAAGATGTGGAGGGGG 

45 G ATGAAAA 2AG 2GA T TGGAGAGTG 2A.AGAGAGG.A 3 AAGG2C^A.AAAT 3GTG 5A„AAATA. 2 PGT 
GTAGGACG PAGAAT 2 AAA FTTAA 3rG , GTGGAAGAGGGAGGGATGTGTGA^AGGA 3.AA 3 2 3 AG 
AGTTGGGA 3 AT 3.AA 3AGTGTGGT 2 A 2 TT TG AC G G G AAG C AT TTT AA 2 AT ■ 2 AAC G G TG FG22T 
TC rC.AATGT 3G 3GTGGGTCCCTC AATACAGTGGAATTCTGATGAAG 3A 3 3GGTGCA.AGTTG 
TTCTGCAG AGTGGCAGGG AACA 2 AG 2 G T AC TAT C AG C TT G G AG AG A 3 AG T 3ATAG ATG AAA 

30 GTC 2JTTGT3GCCAGGACACAAAT 3AATAT 3TGTGTCC.AGGGCCTTTGGGGG 3AAGGTGGATG 
CGATCATGTTTTAAACTCAAAAGCCCGGAGAGATAAATGTGGGGTTTGTGGTGGCGATAAT
TGTTCATGCAAAACAGTGGCAGGAACATTTAATACAGTACATTATGGTTACAATACTGTGG 
TCCGAATTOCAGCTGGTGCTACCAATATTGATGTGCGGCAGCACAGTTTCTCAGGGGAAAC 
AGACGATGACAACTACTTAGCTTTATCAAGCAGTAAAGGTGAATTCTTGCTAAATGGAAAC 
TTTGTTGTCACAATGGCCAAAAGGGAAATTCGCATTGGGAATGCTGTGGTAGAGTACAGTG
GGTCGGAt^ACTGCCGTAGAAAGA,^TTA.\CTCA. Q L CAGATCGCATTGAGCA.^GAACTTTTGCT 
TGAGGTTTTGTCGGTGGGAAAGTTGTACAACCCCGAT'3TACGCTATTCTTTCA.\TATTCCA 
ATTGAAGATAAACCTCAGCAGTTTTACTGGAACAGTCATGGGCCATGGCA-^GCATGCAGTA 
AACCCTGCCAAGGGGAACGGAAACGAAAACTTGTTTGCACCAGGGA^^ 
10 TGTTTCTGATC^^GATGCGATCGGCTGCCCCAGCCTGGACAIATTACTGAACCCTGTGGT 
A!^AGACTGTGACCTGAGGTGGGCCACTGTTTTCTCA.AjGCCrrTAT.A-\,^TGAATTGTGAGA 
GrGrTGGAGGAGGTGCCAGCAGGAGAAGC.^AAAGGAGGGGArGCCGGTGTTTAGTTCCCCT 

t r-:; r pgtg ?ttcagtgaaataagctttaaccaattct "catccctc tc^ aactgattatcc 

A A 3A-ATACATGTGGAGATTTCTTGTTCACCTAAGA^rTAA.^^TAGCTA.^TAGAGTATGG 
1 5 C ACT rGCCAAAAAAAAT I^AGTTGATCCTCACMACTTGCTG 3GTAGGCATTAGCATTATGA 
TTGA 3TCACATTGTACGTGAAAACTTGTTTTGAAAGTCAAAJ^jAA^^^^ 

TC CCTCAAAGTACCCATAATGACCTATATCTACCGAGAGTG r AT A C C AC C C AG TAG AAG AA 
C T C C T AC A C AC C TG A A AG rTGCAGTACACTAAGGTAGCGTCATGGAAGAAACAAGAAGAAA 
ATG T A T A T A TG G AT G T Y T j AG AT A T T C . AAA C AAT T C TG T G T T T AAG AJ\AA AAAAAAAAAAA 
2{) AAAAAAAAAAAAAAAGTACTCTGCGTTGTTAC : ACTGCTTG 2 "CTATAGT 3AGTCGTATT 
(SEQ ID NO:OB)
FIG. 4B

MP20-full-length+1_ORF1 Translation of MP20-full-length in frame +1, ORF 1, threshold 50


mqfvswatlltllvrdlaemgspi'aaaavpj<drlhprqvklletlseyeivspirvtialgs 
??ptwkfkrtrrsinsatd?wpafasssssstssqahyf.lsafgoqflfwltai:agfiap 
lftvtllgtpgvnqtj:fyseekaelkhcfy:-:gyv:jtnsehtavislcsgmlgtff.s:--idgdy 
f i e p l q smd e q e d e e e q i jk pk z i y ?.r 3 a p (j re p stgrk ac dt s e h kx rhs kdkkxtrarxw 

If) G E R I X E A G L > V AA EX S G L A T E A F S 7-. Y G X K T DX T RE KRT H R RT K R F L 3 Y ? R F V E V L WA D X FLM 
VS YHGENLQHY I LTLMS I VAS I YKDP 3 1 GI IE H J I V I VNL I VI HN EQDG PS I 3 FXAQTTEXN 
FOQWiOHSKXS PGG I HHDTAVLLTKC'D ICPAHDKCDTLGLAELGTI 2DPYRSCEI 3EDSGLS 
TAFT I AH E LGKVFNM ? HDDXXKC K. E EGVE 3 P Q H VMA PTLNFYTN P VJ>M S E ' J S REY ITEFLD 
TGYGFGLEXEPESRPYPLPyOLPOILYIJVI^QCEX 

15 KKGCRTQHTPWADGTECEPGKHCrTGFCVPF.EM^^^ 

a i r eg r ;r ? e p Ki jg 1 :< y c vg r pj-ik f e s c xt e f c i , kqkrd f rd e q c ah f dg k h Fr ; 1 1 ig e l px v 

RWYPO YSGIL MKDEC ELF C R VAG f .''3 'AY Y Q 1 , R D RV I DG TPGGQDTNDI C VC/GLC Pjj AG C D H V 
XNSKAERDXCGVCGGDNS SCKTVAGTEXTVHYGYXTVVR I PAGATNI EVRQHSFSGETDDD 
NYEAL 1 SSSr.GEFLLrJGNFW'l , I-IA};REIRvIG:JA\"/EYSGSSTAVERIN3TL'RIECiEELLQVL 
20 S VG K EYK P I ) V R Y S FX I ? I E D K ?Q G F YXX S H G P WQ AC SKPOQGE RKR K L VC T RE E DQ L TV S D 

Q R C D R E P Q P G H I T E PC G T D C D E RV ■/ A T V F S R P I. .■ 
(SEQ ID NO:07)



